“The cloud has enabled us to be more efficient, to try out new experiments at a very low cost, and enabled us to grow the site very dramatically while maintaining a very small team,” said Ryan Park, operations engineer for Pinterest, at the Amazon Web Services Summit, in New York Thursday.
With very little in the way of internal IT infrastructure, Pinterest was able to attract almost 18 million visitors by the month of March, a 50 percent increase from the previous month, according to Web ratings firm ComScore. The site has been one of the fastest growing sites in the history of the Web.
Because of Pinterest’s use of Amazon Web Services (AWS), “we’ve been able to grow the site so successfully with such a small team,” Park said. As of last December, the social networking service employed only 12 people.
Pinterest’s use of Amazon’s S3 (Simple Storage Service) has grown by a factor of 10 since last August. Its use of Amazon’s EC2 (Elastic Compute Cloud) has grown by a factor of 3 in that time frame. The company now stores about 80 million objects in S3, amounting to about 410 terabytes of user data.
“Imagine we were running our data center, and we had to go through a process of capacity planning and ordering and racking hardware. It wouldn’t have been possible to scale fast enough,” Park said.
Pinterest is an online pinboard, a service that allows people to collect and organize items of interest, so they can be viewed by others. The company uses a range of AWS services to run the site.
Today, Pinterest uses about 150 EC2 virtual servers, called instances, to run its core Web services, which are written in Python and use the Django framework. Traffic is balanced across these instances using Amazon ELB (Elastic Load Balancing). “The ELB has a great API, so we can [programmatically] bring in more instances, or take instances out if they are having problems,” Park said.
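The idea of bringing instances into rotation and taking troubled ones out can be illustrated with a minimal in-memory sketch. This is not the ELB API and not Pinterest's code: the `LoadBalancerPool` class, the `remove_unhealthy` helper, and the instance IDs are hypothetical names made up for illustration, and the round-robin routing is an assumption.

```python
class LoadBalancerPool:
    """Toy in-memory stand-in for a load balancer's instance pool (hypothetical)."""

    def __init__(self):
        self._instances = []  # instance IDs currently in rotation
        self._next = 0        # index of the next instance to receive a request

    def register(self, instance_id):
        """Bring an instance into rotation (no-op if already present)."""
        if instance_id not in self._instances:
            self._instances.append(instance_id)

    def deregister(self, instance_id):
        """Take an instance out of rotation (no-op if absent)."""
        if instance_id in self._instances:
            self._instances.remove(instance_id)

    def route(self):
        """Pick the next instance in simple round-robin order."""
        if not self._instances:
            raise RuntimeError("no instances in rotation")
        self._next %= len(self._instances)
        chosen = self._instances[self._next]
        self._next += 1
        return chosen


def remove_unhealthy(pool, health):
    """Deregister every instance whose health check reported False."""
    for instance_id, healthy in health.items():
        if not healthy:
            pool.deregister(instance_id)
```

In this sketch, a monitoring job would call `remove_unhealthy(pool, {"i-b": False})` to stop routing requests to a failing instance, and `pool.register(...)` to add capacity.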
Another 90 EC2 instances are dedicated to caching with memcache. “This allows us to keep a lot of data [in memory] that is accessed very often, so we can keep load off of our database system,” Park said. Another 35 instances are used for internal purposes.
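The caching pattern Park describes, keeping frequently read data in memory so that repeat reads do not reach the database, can be sketched as follows. This is an illustration under assumptions rather than Pinterest's implementation: a plain dictionary stands in for memcache, and `CachedReader` and its `load_from_db` callable are hypothetical names.

```python
class CachedReader:
    """Read-through cache sketch; a dict stands in for memcache (assumption)."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # callable: key -> value (hypothetical)
        self._cache = {}                   # in-memory store standing in for memcache
        self.db_reads = 0                  # counts how often the database was hit

    def get(self, key):
        """Return the cached value, loading from the database only on a miss."""
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a key so the next read fetches a fresh value."""
        self._cache.pop(key, None)
```

Reading the same key twice through `get` touches the database once; the second read is served from memory, which is the load reduction the quote refers to.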
Behind the application, Pinterest runs about 70 master databases on EC2, as well as another set of backup databases located in different regions around the world for redundancy.
To serve its users in a timely fashion, Pinterest spreads its database tables across multiple servers, a technique called sharding. When a database server becomes more than 50 percent full, Pinterest engineers move half its contents to another server. Last November, the company had eight master-slave database pairs. Now it has 64 pairs of databases. “The sharded architecture has let us grow and get the I/O capacity we need,” Park said.
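The split rule described above (move half of a server's contents elsewhere once it passes 50 percent full) can be sketched in a few lines. The function name, the list-of-lists data model, and the `capacity` parameter are hypothetical; the article does not say how Pinterest chooses which rows to move.

```python
def split_if_over_half(servers, capacity):
    """Split any server that holds more than half of its capacity.

    `servers` is a list of lists of row keys and `capacity` is the
    hypothetical number of rows one server can hold. Returns a new list
    of servers; the input list is left unmodified.
    """
    result = []
    for rows in servers:
        if len(rows) > capacity / 2:
            middle = len(rows) // 2
            result.append(rows[:middle])  # this half stays on the original server
            result.append(rows[middle:])  # this half moves to a new server
        else:
            result.append(rows)
    return result
```

If every server crosses the threshold at once, each pass doubles the server count. Three doublings take eight servers to 64, which matches the growth reported here, though the source does not say the splits happened in lockstep.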
In addition to easy scalability, Amazon has also provided Pinterest with the ability to pay only for the resources it needs, which saves the company money. Most of Pinterest’s traffic happens during the afternoon and evening hours in the U.S. It uses AWS’ autoscaling feature so that more instances are added during the day when traffic is heavy, and excess instances are removed at night.
With this approach, the company is able to reduce the number of servers it uses at night by around 40 percent. Because Amazon charges by the hour, this reduction results in cost savings: During times of peak traffic, Pinterest spends about $52 an hour on EC2, though in the wee hours of the night the company can spend as little as $15 an hour.
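To see how hourly billing turns nightly scale-down into savings, the two rates quoted above can be plugged into a simple two-level model. The split of the day into 16 peak hours and 8 off-peak hours is an assumption for illustration only; the article gives no such breakdown, and real spending would vary hour by hour.

```python
PEAK_RATE = 52.0      # dollars per hour at peak traffic (from the article)
OFF_PEAK_RATE = 15.0  # lowest overnight dollars per hour (from the article)


def daily_cost(peak_hours, peak_rate=PEAK_RATE, off_peak_rate=OFF_PEAK_RATE):
    """Cost of one day if `peak_hours` run at the peak rate and the rest off-peak."""
    if not 0 <= peak_hours <= 24:
        raise ValueError("peak_hours must be between 0 and 24")
    return peak_hours * peak_rate + (24 - peak_hours) * off_peak_rate


always_peak = daily_cost(24)    # running the peak fleet around the clock
scaled = daily_cost(16)         # hypothetical: 16 peak hours, 8 off-peak hours
savings = always_peak - scaled  # dollars saved per day under this assumption

print(f"always at peak: ${always_peak:.2f}")
print(f"with scale-down: ${scaled:.2f}")
print(f"daily savings: ${savings:.2f}")
```

Under these assumed hours, the model gives $1,248 a day at constant peak capacity against $952 with scale-down, a difference of $296 a day.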
Amazon’s pay-as-you-go billing also lets Pinterest test new services without incurring the costs of buying servers or software. “There is no big sales process or big upfront costs when we try something out, so we can try experiments to see what works and what doesn’t,” Park said.
One successful experiment has been its use of Amazon’s Hadoop-based Elastic MapReduce for data analysis, a service that costs the company only “a few hundred dollars a month,” he said.
During his own keynote talk, Amazon Chief Technology Officer Werner Vogels noted that today’s Web services, like Pinterest, need to have a way to scale very quickly, should they become successful. It would be almost impossible for a Web service, should it grow wildly popular, to scale to the required size in a short period of time by buying and deploying systems in house. “We’re here to help you with that,” he said.
The company has been successful in this task. Overall, AWS now holds about 762 billion customer objects, and gets about 650,000 requests per second. Cloud intelligence firm DeepField Networks estimates that about one percent of all Internet consumer traffic in North America goes to the Amazon cloud.
Not everyone is enamored of the Amazon service. On the same day as the AWS Summit, environmental group Greenpeace hung a banner from the future Amazon headquarters in Seattle, chastising Amazon and Microsoft for not using clean energy to power their data centers.