Crimson Reason: High Performance Computing


Monday, January 20, 2025

U.S. Tries to Govern AI’s Global Spread

From the Carnegie Endowment for International Peace:

With Its Latest Rule, the U.S. Tries to Govern AI’s Global Spread | Carnegie Endowment for International Peace

Tuesday, April 28, 2020

Russian Driverless Truck

News from RT:

Driverless Truck - https://www.rt.com/russia/487077-kamaz-driverless-trucks-arctic/

Monday, March 23, 2020

COVID-19: Folding@Home

News of Folding@Home:

https://www.tomshardware.com/news/folding-at-home-worlds-top-supercomputers-coronavirus-covid-19

Sunday, March 28, 2010

J2SE 5 greatly extends the concurrency model of Java. It introduces very useful classes for communication between threads, such as bounded and unbounded queues.

One of the best parts of the new, improved Java concurrency model is the java.util.concurrent.BlockingQueue interface.

LinkedBlockingQueue and ArrayBlockingQueue do away with the wait(), notify(), and notifyAll() style of coding. Add to that SynchronousQueue, which is simply amazing, as well as DelayQueue and PriorityBlockingQueue, and you have yourself a whole lot of threading fun!
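As a minimal sketch of how a blocking queue replaces hand-written wait()/notify() code, here is a producer and a consumer sharing a bounded ArrayBlockingQueue. The class name, the capacity of 4, and the -1 "poison pill" stop marker are assumptions made for this illustration, and the lambda syntax needs Java 8 or later (on J2SE 5 the same thing would be written with anonymous Runnable classes).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BlockingQueueDemo {
    // Hypothetical stop marker: tells the consumer there is nothing more to read.
    private static final int POISON_PILL = -1;

    // Produces 1..count on one thread, sums the values on another, returns the sum.
    static long sumThroughQueue(int count) {
        // Bounded queue: put() blocks while it is full, take() blocks while it is empty.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        long[] sum = new long[1];

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    queue.put(i);
                }
                queue.put(POISON_PILL);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int item = queue.take(); item != POISON_PILL; item = queue.take()) {
                    sum[0] += item;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // after join(), the consumer's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(sumThroughQueue(100)); // prints 5050
    }
}
```

Neither thread ever calls wait() or notify(); all of the blocking and waking up happens inside put() and take().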

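Other useful types are CountDownLatch, CyclicBarrier, Executor, ExecutorService, Future, and Callable, along with the cached, fixed, and single-thread pools created through the Executors factory methods (newCachedThreadPool(), newFixedThreadPool(), newSingleThreadExecutor()).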
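To show how a few of these pieces fit together, the sketch below submits Callable tasks to a fixed thread pool and collects the results through Futures. The class name, the pool size of 4, and the sum-of-squares workload are assumptions chosen only for illustration; the lambda syntax needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorDemo {
    // Sums the squares of 1..n, computing each square as a separate pooled task.
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                Callable<Long> task = () -> value * value;
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // lets the worker threads exit once the queue is drained
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```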

I think it is a good idea to start with synchronous and asynchronous queues and cached thread pools, and then delve into the performance considerations of the atomic, lock-based, and concurrent-collection approaches to protecting classes, methods, and fields.

Note that there are command-line switches important to concurrent programming (the -X* family, such as -Xmx for maximum heap size and -Xss for thread stack size) and also that there are Runtime methods designed to help avoid OutOfMemoryError heap problems.
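One plausible reading of the Runtime remark is the trio of maxMemory(), totalMemory(), and freeMemory() on java.lang.Runtime. The sketch below uses them to estimate how much of the maximum heap is in use; the class name and the 0.9 warning threshold are assumptions for this example, not anything prescribed by the JDK.

```java
class HeapCheck {
    // Fraction of the maximum heap currently occupied, between 0 and 1.
    static double usedHeapFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // bytes currently in use
        return (double) used / rt.maxMemory();          // maxMemory() reflects -Xmx
    }

    public static void main(String[] args) {
        double fraction = usedHeapFraction();
        System.out.println("Heap in use: " + Math.round(fraction * 100) + "%");
        if (fraction > 0.9) { // hypothetical threshold
            System.out.println("Close to the heap limit; consider throttling new work.");
        }
    }
}
```

A thread pool could consult a check like this before accepting more work, instead of discovering the problem through an OutOfMemoryError.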

Some of this exciting stuff is covered in "Thinking in Java" (4th edition) by Bruce Eckel, but that book does not concentrate on concurrent programming.

The book to read is:

Java Concurrency in Practice
by Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, Doug Lea

Wednesday, January 20, 2010

Mitrion-C

Mitrionics AB (http://www.mitrionics.com/ ) has released Mitrion-C for writing parallel code. This is a new language that uses "C" syntax and is optimized for parallel compilation on FPGAs as well as multi-core processors. See the company announcement @ http://www.mitrionics.com/?page=view_release&itemid=9947&back=mitrionics-sc09-press-kit

The language details may be found @ http://www.science.uwaterloo.ca/~hmerz/FPGA/The%20Mitrion-C%20Programming%20Language.pdf

Saturday, October 3, 2009

Threading References

From the article “Three Sides of Threading” by Claire Cates in the October 2009 issue of the Software Test & Performance Magazine:

Developing Multithreaded Applications: A Platform Consistent Approach @
http://cache-www.intel.com/cd/00/00/05/15/51534_developing_multithreaded_applications.pdf

Multithreading Programming Guide @ http://docs.sun.com/app/docs/doc/816-5137

Dr. Dobbs: Performance Analysis and Multicore Processors : March 30, 2006 @ http://www.ddj.com/dept/64bit/184417069

Dr Dobbs: Multitasking Alternatives and the Perils of Preemption: Sept 14, 2006 @ http://www.ddj.com/dept/embedded/193000965

Data Placement in Threaded Programs: from the Intel Software Network Threading Methodology: Principles and Practices @
http://cache-www.intel.com/cd/00/00/21/93/219349_threadingmethodology.pdf

Intel Software Developer Webinar Series, Fall 2007

W. Brown, R. Malveau, H. McCormick III, T. Mowbray, “AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis,” John Wiley, New York (1998)

C. Smith, L.Williams, “More New Software AntiPatterns,” Proceedings of 2003 Computer Measurement Group

Friday, September 18, 2009

RapidMind

RapidMind is a multi-core development platform that supports Streaming SIMD Extensions. It was recently bought by Intel. Learn more about it @ http://www.rapidmind.com/.

Cilk++

Cilk++ is three keywords and a runtime system that extend C++ to the realm of multicore (or parallel) programming. Cilk++ is platform-independent. It was recently bought by Intel. Learn more about it @ http://www.cilk.com/multicore-products/cilk-solution-overview/

Tuesday, July 7, 2009

Axum

New toys from Bill's boys er Steve's boys @ http://msdn.microsoft.com/en-us/devlabs/dd795202.aspx

Thursday, June 25, 2009

WhisperStation-PSC

WhisperStation-PSC is a personal supercomputer featuring Tesla GPUs and the CUDA SDK. It contains up to four Tesla C1060 GPUs, two AMD or Intel processors, 64 GB memory and optional RAID storage.

The supercomputer is designed for computationally-intensive OpenMP applications that employ hundreds of parallel threads to speed up floating point operations, such as industrial design, financial modeling, space exploration, CFD simulation and medical imaging.

Learn more @ www.microway.com

Saturday, June 13, 2009

HIPerWall

HIPerWall is a wall built of numerous high-definition monitors, each with its own embedded computer for displaying standard and large (one gigabyte or more) graphic images, high-definition (HD) digital movies, and streaming content from video cameras and other live feeds.

Built at Calit2 (California Institute for Telecommunications and Information Technology) at UC Irvine, the Highly Interactive Parallelized Display Wall (hence HIPerWall) has the ability to display 200 Megapixel images.

It is designed to visualize enormous data sets and allows viewers to see detail, with 100 dots per inch on the screens, while retaining the context of an overview by seeing surrounding data (also in high detail). This allows a group of scientists to collaborate and share detailed information while still keeping the big picture in view.

Learn more @ http://hiperwall.calit2.uci.edu/

Thursday, June 11, 2009

More on Parallel Processing

I found the various comments posted at the GCN Government Computer News blog (http://gcn.com/Blogs/Tech-Blog/2009/06/New-parallel-processing-languages.aspx ) quite insightful.

Take a look!

Thursday, April 16, 2009

Introducing the FAWN

This is a "cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2--16GB) of flash memory into an ensemble capable of handling 700 queries per second per node, while consuming fewer than 4 watts of power per node."

Tuesday, February 3, 2009

Approaches to Parallelism

A new model for writing concurrent applications is based on a structure called an actor. An actor is a computational entity whose primary actions are performing the operations that are passed to it, passing data to other actors, and creating new actors.

These operations are entirely local; they have no effects on other actors. To affect other actors, an actor must pass a message to them, including the data they need or the new instructions.
This parallel-programming technique uses immutable values (like Java strings) rather than variables, so data cannot be unexpectedly changed by another thread.

Data changes among actors are made by sending messages to one another. Because of this built-in mutual exclusion, actors are a good match for parallelism.
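The idea can be sketched in plain Java without any actor framework: give the actor a mailbox and a single thread that drains it, so that its state is only ever touched by that thread. Everything named here (CounterActor, the Add/Get/Stop messages, the demo method) is invented for this illustration and is not the API of Erlang, ActorFoundry, or Scala.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActor {
    // Messages are the only way to reach the actor's state.
    interface Message {}
    static final class Add implements Message {
        final int amount;
        Add(int amount) { this.amount = amount; }
    }
    static final class Get implements Message {
        final CompletableFuture<Integer> reply = new CompletableFuture<>();
    }
    static final class Stop implements Message {}

    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
    private int total; // read and written only by the actor's own thread

    private CounterActor() {}

    // Creates the actor and starts the thread that processes its mailbox.
    static CounterActor start() {
        CounterActor actor = new CounterActor();
        Thread worker = new Thread(actor::processMailbox);
        worker.setDaemon(true);
        worker.start();
        return actor;
    }

    void send(Message message) { mailbox.add(message); }

    private void processMailbox() {
        try {
            while (true) {
                Message m = mailbox.take(); // one message at a time, in arrival order
                if (m instanceof Add) {
                    total += ((Add) m).amount;
                } else if (m instanceof Get) {
                    ((Get) m).reply.complete(total);
                } else if (m instanceof Stop) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Several threads each send Add(1) messages; returns the total the actor reports.
    static int demo(int senders, int perSender) {
        CounterActor counter = CounterActor.start();
        List<Thread> threads = new ArrayList<>();
        for (int s = 0; s < senders; s++) {
            Thread t = new Thread(() -> {
                for (int i = 0; i < perSender; i++) {
                    counter.send(new Add(1));
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Get query = new Get();
        counter.send(query); // queued behind every Add sent above
        int result = query.reply.join();
        counter.send(new Stop());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // prints 4000
    }
}
```

No lock or synchronized block appears anywhere: the counter stays consistent because only the mailbox-draining thread ever updates it.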

Erlang is the language with the greatest commercial acceptance that uses actor-like constructs. For Java, there is an actor framework called ActorFoundry. But, on the JVM, the best choice is the emerging language Scala, which provides support for traditional object-oriented-style programming as well.

This approach is similar to dataflow, a design that was first expounded in the 1960s. It too uses message passing and adds a built-in capability to monitor the relationship between two data items, such that if one changes, the other is automatically updated.
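The "if one changes, the other is automatically updated" relationship can be sketched as a tiny single-threaded cell class. DataflowCell and derivedFrom are names made up for this example; real dataflow systems (including DataRush) are far more elaborate, and this sketch makes no attempt at thread safety.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

class DataflowCell {
    private int value;
    private final List<Runnable> dependents = new ArrayList<>();

    DataflowCell(int initial) { value = initial; }

    int get() { return value; }

    // Stores the new value, then pushes the change to every dependent cell.
    void set(int newValue) {
        value = newValue;
        for (Runnable update : dependents) {
            update.run();
        }
    }

    // Creates a cell whose value always equals rule(source value).
    static DataflowCell derivedFrom(DataflowCell source, IntUnaryOperator rule) {
        DataflowCell derived = new DataflowCell(rule.applyAsInt(source.get()));
        source.dependents.add(() -> derived.set(rule.applyAsInt(source.get())));
        return derived;
    }

    public static void main(String[] args) {
        DataflowCell celsius = new DataflowCell(0);
        DataflowCell fahrenheit = derivedFrom(celsius, c -> c * 9 / 5 + 32);
        System.out.println(fahrenheit.get()); // prints 32
        celsius.set(100);
        System.out.println(fahrenheit.get()); // prints 212
    }
}
```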

Pervasive Software is releasing the DataRush Java library, which has handled massive amounts of data in tests on only modest hardware platforms. The product can leverage dataflow across all the processor cores.

It seems that message-passing parallelism (think MPI) is likely to become more prominent during the next few years as a way to leverage the many cores in today’s PCs and servers.

Saturday, October 11, 2008

Amazon EC2 API

The Amazon Elastic Compute Cloud (Amazon EC2) web service API gives one the ability to execute arbitrary applications in the Amazon computing environment.

EC2 API has the momentum to become the x86 "chipset" of Cloud Computing.

Learn more about it @ http://docs.amazonwebservices.com/AmazonEC2/gsg/2006-06-26/

Tuesday, August 19, 2008

Free .Net Deadlock Debugging Tool

Corneliu, creator of the GUI debugging tool HawkEye, has a deadlock debugging tool that works against most .NET programs without recompiling them. The following are its features:

1. The tool does not require you to recompile your code in any way, add external dependencies, reference an external library, or modify your code to use any special type of lock.
2. It works on release builds with no PDB files.
3. It works on running processes or previously captured memory dumps.
4. It detects deadlocks across multiple threads and returns detailed call-stack and lock usage information.
5. It only detects deadlocks in which threads are actively waiting for locks acquired by other threads.
6. It does not detect the dining philosophers problem or deadlocks created in combination of time waits + wake/check + lock.
7. It has an external dependency on cdb.exe (part of the free Debugging Tools for Windows package from Microsoft).
8. It requires absolutely no installation. It is an xcopy deployment.
9. And best of all it’s free (source code to be published soon)

The tool is discussed @ http://www.acorns.com.au/blog/?p=129 and may be obtained @ http://www.acorns.com.au/files/ACorns.Debugging.DeadlockTests.1.0.1.zip

Saturday, July 26, 2008

Lilly Science Grid

Eli Lilly has made its Discovery IT platform, known internally as the Lilly Science Grid (LSG), open source.

LSG is a plug-in hosting and deployment framework that sits on top of Microsoft’s Composite Application Block. It is a rich client that requires .NET 2.0 or higher.

The framework simplifies the task of creating new plug-ins by providing a Visual Studio template from which developers can quickly learn and expand. Users can choose which applications and plug-ins to use within an integrated environment.

Find it @ http://sourceforge.net/projects/lsg

Thursday, April 17, 2008

Actor Model of Concurrency

Erlang implements the Actor Model of concurrency. Learn more @ http://en.wikipedia.org/wiki/Actor_model and @ http://www.ccs.neu.edu/home/tov/erlang-talk/

AMD CTM

This is a free interface for interacting with AMD's GPU products (similar to NVIDIA's CUDA). It is available @ http://ati.amd.com/companyinfo/researcher/resources.html

Thursday, April 10, 2008

NVIDIA Tesla C870

NVIDIA Tesla C870 is a Graphics Processing Unit (GPU) designed for high performance computing (HPC) solutions. It brings supercomputing power to any workstation or server and to standard, CPU-based server clusters.

Key elements include:
A massively multi-threaded architecture with a 128-processor computing core
A C-language development environment for the GPU (called CUDA)
A suite of developer tools (C compiler, debugger, performance profiler, optimized libraries)
A seamless fit into existing HPC environments

The CUDA programming guide is a good place to start. You can get it @ http://developer.nvidia.com/object/cuda.html

Then download the SDK. The sample code is a great learning resource. Also go through the UIUC course on parallel programming, with its special focus on CUDA, @ http://courses.ece.uiuc.edu/ece498/al1/Syllabus.html

The SDK shows simple ways to integrate C++ and CUDA.