Computational science and all its applied forms (statistics, Deep Learning, computational finance, computational biology, etc.) depend on a discrete model of the equations that describe the system under study, and on some operator to extract information from that model. The vast majority of these operators are expressed in linear algebra form, and the dominant sub-operation in these operators is a dot product between two vectors. With the advent of big data and Deep Learning, tensor expressions have become widespread, but the basic operation remains squarely the dot product, as can be observed in the key operator that DL accelerators like the Google TPU implement in hardware.
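To make the point about the dot product concrete, here is a minimal sketch in plain Python showing that both a matrix-vector product and a matrix-matrix product reduce to repeated dot products. The helper names (`dot`, `matvec`, `matmul`) are illustrative only and are not taken from any particular library or accelerator.

```python
def dot(x, y):
    """Dot product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Matrix-vector product: one dot product per row of A."""
    return [dot(row, x) for row in A]


def matmul(A, B):
    """Matrix-matrix product: one dot product per (row of A, column of B)."""
    cols = list(zip(*B))  # columns of B
    return [[dot(row, col) for col in cols] for row in A]


A = [[1, 2], [3, 4]]
x = [5, 6]

print(matvec(A, x))  # [17, 39]
print(matmul(A, A))  # [[7, 10], [15, 22]]
```

Hardware that accelerates the inner `dot` call therefore accelerates every operator built on top of it.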
With the advent of large-scale Deep Learning applications, all major players have recognized that IEEE floating point is the limiting factor for performance, scalability, and power. As a consequence, Google, Microsoft, Facebook, Baidu, etc. have all ditched IEEE floating point in favor of more efficient number systems, and have reaped the benefits against competitors that have stayed with IEEE floating point. This blog post describes the posit number system and its benefits over IEEE floating point for high-performance applications, such as Deep Learning.
The importance of next generation sequencers (NGS) to personalized healthcare opportunities is well documented. Conventional medical diagnosis relies on the analysis of the patient's personal and family history, symptoms, and samples. The goal of personalized health care is to strengthen the diagnostics by including comparisons between the patient's genome and a global reference database of known disease markers. Sample testing will also be enhanced through highly sensitive measurement of adaptive immune system parameters, and molecular-level monitoring of the progression of different cancers. Outside human health, NGS will touch every aspect of our understanding of the living world, and help improve food safety, control pests, reduce pollution, find alternate sources of energy, etc.
Over the past 12 months, we have implemented a handful of global cloud platforms that connect the US, EU, and APAC. The common impetus behind these projects is to connect brain trusts in these geographies. Whether they are supply chains in Asia that are program-managed from the EU, healthcare cost improvements in the US through the use of radiologists in India, or high-tech design teams that are collaborating on a new car or smartphone design, all these efforts are trying to implement the IT platform to create the global village.
The lesson from these implementations is that cloud computing is more or less a solved problem, but cloud collaboration is far from done. From an architecture point of view, cloud collaboration faces constraints similar to those faced by mobile application platforms, so there is no doubt that in the next couple of years we'll see lots of nascent solutions to the fundamental problem of mobility and cloud collaboration: data movement.
The data sets in our US-China project ranged from tens to hundreds of TBytes, but data expansion was modest at a couple of GBytes a day. For a medical cloud computing project, the data set was more modest at 35TBytes, but the data expansion of these data sets could be as high as 100GB per day, fueled by high-volume instruments, such as MRI or NGS machines. In the US-China collaboration, the problem was network latency and packet loss, whereas in the medical cloud computing project, the problem was how to deal with multi-site high-volume data expansions. The cloud computing aspect of all these projects was literally less than a couple of man-weeks' worth of work. The cloud collaboration aspect of these projects all required completely new technology developments.
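A rough sketch of why data movement dominates these projects, using the 35TByte data set and the 100GB-per-day expansion quoted above. The 1 Gbit/s link speed and the 30% effective throughput (standing in for latency and packet loss) are hypothetical values chosen only for illustration, not measurements from the projects; decimal units are assumed.

```python
TB = 10**12  # bytes, decimal units assumed
GB = 10**9
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bits_per_s, efficiency=1.0):
    """Days needed to move data_bytes over a link at a given efficiency."""
    effective_rate = link_bits_per_s * efficiency  # usable bits per second
    return data_bytes * 8 / effective_rate / SECONDS_PER_DAY


def sustained_mbit_per_s(bytes_per_day):
    """Average bandwidth needed just to keep up with daily data growth."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


LINK = 10**9  # hypothetical 1 Gbit/s WAN link

ideal = transfer_days(35 * TB, LINK)        # link fully utilized
lossy = transfer_days(35 * TB, LINK, 0.3)   # hypothetical 30% efficiency
growth = sustained_mbit_per_s(100 * GB)

print(f"initial sync, ideal link: {ideal:.1f} days")   # 3.2 days
print(f"initial sync, lossy link: {lossy:.1f} days")   # 10.8 days
print(f"bandwidth to track growth: {growth:.2f} Mbit/s per site")  # 9.26
```

Under these assumptions the initial synchronization takes days rather than hours, and every additional site multiplies the sustained bandwidth needed to keep replicas current.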
In the next few weeks, I'll describe the different projects, their business requirements, their IT architecture manifestation, and the key technologies that we had to develop to deliver their business value.
The Semantic Web captures the semantics, or meaning, of data, and enables machines to interact with that metadata. It is an idea of WWW pioneer Tim Berners-Lee, who observed that although search engines index much of the Web's content, keywords can only provide an indirect association to the meaning of an article's content. He foresees a number of ways in which developers and authors can create and use the Semantic Web to help context-understanding programs better serve knowledge discovery.
Tim Berners-Lee originally expressed the vision of the Semantic Web as follows:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web -- the content, links, and transactions between people and computers. A "Semantic Web", which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The intelligent agents people have touted for ages will finally materialize.
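Semantic Web data is commonly modeled as subject-predicate-object triples. The following is a minimal sketch of that idea in plain Python, with a wildcard pattern match standing in for a query. The identifiers (`article42`, `hasTopic`, and so on) and the `match` helper are hypothetical and do not come from any real vocabulary or triple store API.

```python
# A tiny in-memory set of (subject, predicate, object) triples.
triples = {
    ("article42", "hasAuthor", "alice"),
    ("article42", "hasTopic", "genomics"),
    ("article7", "hasTopic", "genomics"),
    ("alice", "worksFor", "acme"),
}


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "Which resources are about genomics?" is answered from the metadata,
# not from keywords in the article text.
about_genomics = [s for s, _, _ in match(triples, p="hasTopic", o="genomics")]
print(about_genomics)  # ['article42', 'article7']
```

A production triple store adds indexing, persistence, and a query language on top of this basic pattern-matching model.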
The world of semantic databases just got a little bit more interesting with the announcement by Franz, Inc. and Stillwater Supercomputing, Inc. that they have reached a trillion-triple semantic data store for telecommunication data.
The database was constructed with an HPC on-demand cloud service and occupied 8 compute servers and 8 storage servers. The compute servers contained dual-socket Xeons with 64GB of memory, connected through a QDR IB network to a 300TB SAN. The trillion-triple data set spanned roughly 100TB of storage. It took roughly two weeks to load the data, but after that the database provided interactive query rates for knowledge discovery and data mining.
The gear on which this result was produced is traditional HPC gear that emphasizes scalability and a low-latency interconnect. As a comparison, a billion-triple version of the database was created on Amazon Web Services, but the performance was roughly 3-5x slower. Creating a trillion-triple semantic database on AWS would have cost $75k and would have taken 6 weeks to complete.
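As a back-of-envelope check, the rounded figures above imply the following storage footprint and load rates. This sketch assumes decimal units, a load time of exactly 14 days, and an even split of the work across the 8 compute servers; none of these are stated precisely in the announcement, so the results are order-of-magnitude estimates only.

```python
TRIPLES = 10**12                 # one trillion triples
STORAGE_BYTES = 100 * 10**12     # roughly 100TB, decimal units assumed
LOAD_SECONDS = 14 * 24 * 3600    # roughly two weeks, taken as 14 days
COMPUTE_SERVERS = 8

bytes_per_triple = STORAGE_BYTES / TRIPLES          # storage footprint per triple
load_rate = TRIPLES / LOAD_SECONDS                  # triples loaded per second
per_server_rate = load_rate / COMPUTE_SERVERS       # assumes an even split
write_mb_per_s = STORAGE_BYTES / LOAD_SECONDS / 1e6 # average write bandwidth

print(f"{bytes_per_triple:.0f} bytes per triple")             # 100
print(f"{load_rate:,.0f} triples/s overall")                  # about 826,720
print(f"{per_server_rate:,.0f} triples/s per compute server") # about 103,340
print(f"{write_mb_per_s:.1f} MB/s average write rate")        # about 82.7
```

The modest average write bandwidth suggests that the load is bounded by per-triple indexing work rather than by raw storage throughput, which is consistent with the emphasis on low-latency interconnect.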
The explosion of data, and the need to make sense of it all on a smartphone, is creating an interesting opportunity. Mobile devices need high performance at low power, and Apple seems to be the only one that has figured out that having your own processor team and IP is actually a key advantage. The telcos will need Petascale data centers to manage content, knowledge management, and operational intelligence, and the performance per Watt of general-purpose CPUs from IBM, Intel, and AMD is at least two orders of magnitude away from what is possible. So why is there so little innovation in cloud hardware?
The rule of thumb for creating a new chip venture is $50-75M. Clearly the model where your project is just an aggregation of third-party IP blocks is not a very interesting investment, as it would create no defensible position in the marketplace. So from a differentiation point of view, early-stage chip companies need to have some unique IP. And this IP needs to be substantial. This creates the people and tool cost that makes chip design expensive.
Secondly, to differentiate on performance, power, or space you have to be at least close to the leading edge. When Intel is at 32nm, you can’t pick 90nm as a feasible technology to compete. So mask costs are measured in the millions for products that try to compete in high-value silicon.
Thirdly, it takes at least two product cycles to move the value chain. Dell doesn’t move until it can sell 100k units a month, and ISVs don’t move until there are millions of units of installed base. So the source of the $50M-$75M needed for a fabless semiconductor company is that creating new IP is a $20-25M problem if presented to the market as a chip, and it takes two cycles to move the supply chain, and three cycles to move the software.
The market dynamics of IT have created this situation. It used to be the case that the enterprise market drove silicon innovation. However, the enterprise market is now dragging the silicon investment market down. Enterprise hardware and software are no longer the driving force: innovation is now driven by the consumer market. And that game is played and controlled by the high-volume OEMs. First, their cost constraints and margins make delivering IP to these OEMs very unattractive: they hold all the cards and squeeze pricing so that continued engineering innovation is hard to sustain for a startup. Second, an OEM is not interested in unique IP created by a third party: it would reduce the OEM's leverage. So you end up getting only the non-differentiable pieces of technology and a race to the bottom.
Personally, however, I believe that there is a third wave of silicon innovation brewing. When I calculate the efficiency that Intel gets out of a square millimeter of silicon and compare that to what is possible, I see a thousandfold difference. So there are tremendous innovation possibilities from an efficiency point of view alone. Combining this with the trend to put intelligence into every widget and connect them wirelessly provides the application space where efficient silicon that delivers high performance per Watt can really shine AND have a large potential market. Mixed-signal and new processor architectures will be the differentiators, and the capital markets will at some point recognize the tremendous opportunity to create a next-generation technology that powers these intelligent platforms.
Until then, those of us who are pushing the envelope will continue to refine our technologies so we can be ready when the capital market catches up with the opportunities.
We are building the next generation platform for computational science and engineering. At Stillwater we believe that the 21st century belongs to the computational scientist and that many important innovations will be driven by computational models. We want to aid in that quest with a high productivity environment for machine learning, data mining, bioinformatics, quantitative finance, statistics, and computational science.