Microsoft has used its internal big-data service, Cosmos, for years, and now it’s looking to take the service public. Cosmos is behind most of Microsoft’s consumer services, including Azure, Bing, AdCenter, MSN, Skype, and more, and it reportedly has thousands of users and 5,000 customers. The news comes from a post by Mary Jo Foley, who cites her usual sources and some Microsoft Careers job postings that have apparently since been removed.
Cosmos uses Dryad, a Microsoft Research-inspired project that lets developers use the resources of a data center to run data-crunching programs in parallel, making those programs efficient for large-scale efforts. It’s used within Microsoft, according to Foley, to:
… process telemetry data; to perform analysis and reporting on large datasets, such as those created via Bing and Office 365; and to curate and perform back-end processing on many kinds of data. A lot of the data used for these various purposes is shared. Queries on this data can run on anywhere from one to 40,000 machines in parallel.
Microsoft currently offers HDInsight, which uses the open source Hadoop to power big data on Azure. Cosmos will complement that effort, letting programmers choose between HDInsight and querying data in Cosmos with SQL-IP, possibly from a Visual Studio plug-in.
Microsoft has been running big data operations for years, beginning with services like Hotmail and continuing with Bing and Azure, among others. Its expertise in handling these large data sets, and the tools it uses to manage them, could become Microsoft’s next hit in services-for-pay.