Services & Support
HCC employs application specialists for more extensive project development. For training, proposal development, or assistance with scientific computing projects and other issues, contact:
Dr. David Swanson
Director of HCC
When a computational process requires massive reads and writes (I/O), conventional file systems generally become the bottleneck, and the problem gets worse when the running process is a parallel program. To support such demanding applications, HCC provides a parallel file system from Panasas, Inc. The total storage capacity of the Panasas file system at HCC is 140 TB, divided into two volumes.
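The core idea behind a parallel file system is striping: data is split into fixed-size units and spread round-robin across several storage targets, so many targets can serve reads and writes at once instead of one server handling everything. A minimal sketch of that idea in plain Python (the stripe size and target count here are made up for illustration, not Panasas parameters):

```python
STRIPE_SIZE = 4  # bytes per stripe unit (tiny, for illustration only)

def stripe(data, n_targets):
    """Distribute data round-robin across n_targets storage targets."""
    targets = [bytearray() for _ in range(n_targets)]
    for i in range(0, len(data), STRIPE_SIZE):
        targets[(i // STRIPE_SIZE) % n_targets] += data[i:i + STRIPE_SIZE]
    return targets

def reassemble(targets, length):
    """Read stripe units back in round-robin order from all targets."""
    out = bytearray()
    cursors = [0] * len(targets)
    t = 0
    while len(out) < length:
        out += targets[t][cursors[t]:cursors[t] + STRIPE_SIZE]
        cursors[t] += STRIPE_SIZE
        t = (t + 1) % len(targets)
    return bytes(out)

data = b"parallel file systems stripe data"
targets = stripe(data, 3)
assert reassemble(targets, len(data)) == data
```

In a real parallel file system the targets are separate servers accessed concurrently, which is why aggregate bandwidth scales with the number of targets.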
The Hadoop Distributed File System (HDFS) provides replicated data storage, keeping the copies of each block on different racks. HDFS provides an economical storage platform with high availability. At HCC, HDFS is primarily used for managing and storing multiple copies of data across the clusters.
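The rack-aware placement idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual HDFS placement code, and the node names, topology, and three-replica default are assumptions for the sketch: one replica lands on the writer's rack and the remaining replicas on a different rack, so the loss of a single rack cannot destroy every copy.

```python
def place_replicas(topology, local_rack, n_replicas=3):
    """Pick nodes for n_replicas copies of a block:
    one on the writer's rack, the rest on one other rack."""
    local_nodes = topology[local_rack]
    remote_rack = next(r for r in topology if r != local_rack)
    remote_nodes = topology[remote_rack]

    placement = [(local_rack, local_nodes[0])]
    for node in remote_nodes[: n_replicas - 1]:
        placement.append((remote_rack, node))
    return placement

# Hypothetical two-rack topology for illustration.
topology = {
    "rack1": ["node-a", "node-b"],
    "rack2": ["node-c", "node-d"],
}
replicas = place_replicas(topology, "rack1")
# The three replicas span both racks, so one rack failure leaves a copy intact.
```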
HCC can manage dedicated disk arrays and disk storage systems containing multiple disk drives. If you would like to store large datasets for an extended period of time, we can help you implement such a solution.
Condor is a high-throughput distributed computing framework that may be used for dedicated or opportunistic workload management. At HCC, Condor is widely used for managing opportunistic workloads along with GlideinWMS to farm jobs out to the Open Science Grid (OSG). We are working on a project that will scavenge idle cycles from student lab machines and bring them into a pool for research use.
The primary scheduler used at HCC is based on the TORQUE Resource Manager together with the Maui Cluster Scheduler. To submit jobs on Firefly or Sandhills, see the Maui FAQ maintained by HCC.
Hadoop is an open-source implementation of Google's MapReduce framework together with a distributed file system (HDFS). HCC has primarily utilized HDFS to support very large datasets for the CMS project. Various researchers have used MapReduce for the analysis of large datasets.
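The MapReduce programming model can be illustrated with the classic word count in plain Python. This is a sketch of the model's three phases, not the Hadoop API: the map phase emits (word, 1) pairs, a shuffle groups the pairs by key, and the reduce phase sums each group.

```python
from collections import defaultdict

def map_phase(line):
    # Emit a (word, 1) pair for every word in an input line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores data", "hadoop analyzes data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
# counts == {"hadoop": 2, "stores": 1, "data": 2, "analyzes": 1}
```

In Hadoop the map and reduce functions run in parallel across the cluster and the shuffle moves data between nodes; the structure of the user's code is the same.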
HCC Cloud is an OpenStack (http://www.openstack.org/) implementation currently being developed with selected collaborators. It allows users to create and manage virtual machines with particular characteristics when more traditional approaches are not appropriate to solve their problems.
Red is a 215-node, 1,140-core cluster purchased to help analyze data for the CMS particle physics experiment; it is currently deployed on the Open Science Grid.
The Firefly cluster consists of 1,151 nodes: 280 run two quad-core AMD processors, while the other 871 run two dual-core Opteron processors.
Sandhills, constructed in 2011, has 1,440 AMD cores housed in a total of 44 nodes. The calculated HPL benchmark for Sandhills is approximately 9,416 GFlops.
MPI stands for Message Passing Interface, the de facto standard for tightly coupled large-scale cluster computing. MPI libraries are routinely called from C or Fortran codes.
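MPI itself is a C/Fortran library, but the send/receive pattern it standardizes can be mimicked in plain Python with the standard-library `multiprocessing` module (an analogy only, not MPI): one process sends a message over a channel, and another blocks until the message arrives, just as with MPI_Send and MPI_Recv.

```python
from multiprocessing import Process, Pipe

def worker(conn, rank):
    # Analogue of MPI_Send: pass a message back to the "rank 0" process.
    conn.send(f"hello from rank {rank}")
    conn.close()

def run():
    parent, child = Pipe()
    p = Process(target=worker, args=(child, 1))
    p.start()
    message = parent.recv()  # analogue of MPI_Recv: blocks until data arrives
    p.join()
    return message

if __name__ == "__main__":
    print(run())  # prints "hello from rank 1"
```

Real MPI adds collective operations (broadcast, scatter, gather, reduce) and runs across many machines, but point-to-point send/receive is the primitive everything else builds on.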
OpenMP is an API for programming shared-memory architectures; it is growing in importance as core counts continue to increase.
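The shared-memory model OpenMP targets can be sketched with Python threads. This is an analogy only: OpenMP itself is a compiler-directive API for C, C++, and Fortran, and Python's GIL prevents a real speedup here. The point is the structure: several threads update one shared result, with a lock playing the role of an OpenMP critical section.

```python
import threading

total = 0
lock = threading.Lock()

def partial_sum(numbers):
    global total
    s = sum(numbers)  # each thread works on its own chunk of the data
    with lock:        # like an OpenMP "critical" section: serialize the update
        total += s

data = list(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, 100, 25)]
threads = [threading.Thread(target=partial_sum, args=(c,)) for c in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()
# total == 5050, the same answer a serial sum would give
```

In OpenMP the equivalent would be a `#pragma omp parallel for` with a reduction clause; there the threads are OS threads running C code truly in parallel on separate cores.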
Compute Unified Device Architecture (CUDA) was developed by NVIDIA to allow programs to be ported to general-purpose graphics processing units (GPGPUs).
HCC is affiliated with the Open Science Grid and the XSEDE project. Grid protocols allow HCC users to access international resources.