About the Twisted Framework (talk at HighLoad++ 2009)
As an introduction to asynchronous programming and a very brief overview of the Twisted Framework, I am publishing my talk from HighLoad++ (2009).
As concrete examples, the talk presents three applications built on Twisted with my participation. For each one it examines the architecture, specific performance figures, optimization techniques, and the advantages and disadvantages of Twisted for the task:
- pyFMS, an RTMP server: the broadcast server of the Smotri.Com service (hundreds of streams, tens of thousands of viewers);
- the backend server of the MDC project: storage and processing of users' communication history, storage of settings, etc.;
- the Qik Push Engine server: instant delivery of information about changes to videos created by users of the service, including push notifications about new live streams, scaling, and processing large volumes of information.
Article based on information from habrahabr.ru
Recently, attention in web development has been shifting from "heavy" application servers, which spend hundreds of milliseconds or even seconds processing a request, to more lightweight services that transmit smaller amounts of data with minimal delay. Instead of generating tens or hundreds of kilobytes of HTML in response to each request, services are moving to transferring only the changed data, packed in JSON and measured in hundreds of bytes. Examples of such services include Gmail, FriendFeed, Twitter, Live Search, etc.
To keep latency minimal for the user, you must either maintain a persistent connection (e.g., Adobe Flash, RTMP) or use the HTTP long polling technique in combination with keep-alive. Either way, the server ends up with a large number of simultaneous connections (thousands or tens of thousands), each carrying a fairly small amount of data. This situation is known as the C10k problem.
The architectural choices for handling connections on the server side are limited: a process per connection, a thread per connection, a combined process/thread model, or asynchronous I/O (possibly combined with additional processes or threads). With more than 10 thousand concurrent connections, creating 10 thousand processes is out of the question in terms of resource consumption, and 10 thousand threads are unlikely to be a reasonable solution either. Bear in mind also that, with so many connections, the amount of work on each one is relatively small and most of them sit idle waiting for new data. Most of the processes or threads would therefore just be waiting, wasting system resources.
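A rough back-of-the-envelope estimate makes the resource argument more concrete. The sketch below assumes an 8 MiB stack reservation per thread; that figure is a common default on Linux but is an assumption here, not a number from the talk, and it describes reserved address space rather than memory actually in use.

```python
# Back-of-the-envelope estimate for the thread-per-connection model.
# ASSUMPTION: 8 MiB of stack is reserved per thread; real values depend
# on the OS and its settings, and reserved space is not resident memory.
MIB = 1024 * 1024
GIB = 1024 * MIB
ASSUMED_STACK_BYTES = 8 * MIB


def reserved_stack_gib(connections, stack_bytes=ASSUMED_STACK_BYTES):
    """Return the stack space reserved for one thread per connection, in GiB."""
    return connections * stack_bytes / GIB


print(reserved_stack_gib(10_000))
```

Even if only a fraction of that reservation is ever touched, the per-thread overhead is paid for connections that spend almost all of their time idle.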
Asynchronous I/O makes it possible to perform non-blocking network I/O on thousands of open sockets within a single thread of execution (a single process). The implementation mechanisms differ between operating systems, for example: select(), poll(), epoll(), kqueue(), etc. Examples of applications that use asynchronous I/O:
- nginx (uses additional processes to handle tasks that require more CPU);
- haproxy;
- memcached;
- and more.
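The idea of one thread watching many sockets can be sketched with Python's standard selectors module, which wraps select(), poll(), epoll() or kqueue() depending on the platform. This is only an illustration, not code from the talk or from Twisted: the helper name echo_ready_sockets, the use of socket pairs instead of real network clients, and the count of 100 connections are all assumptions made for the demo.

```python
import selectors
import socket


def echo_ready_sockets(selector, timeout=0.1):
    """Run one pass of the event loop: echo data back on every readable socket."""
    handled = 0
    for key, _events in selector.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)      # does not block: the OS reported data is ready
        if data:
            conn.sendall(data)      # echo the bytes back to the peer
        else:
            selector.unregister(conn)   # peer closed the connection
            conn.close()
        handled += 1
    return handled


# DefaultSelector picks the most efficient mechanism available on this OS.
selector = selectors.DefaultSelector()
clients = []
for _ in range(100):                # 100 "connections", all served by one thread
    server_end, client_end = socket.socketpair()
    server_end.setblocking(False)
    selector.register(server_end, selectors.EVENT_READ)
    clients.append(client_end)

# Only a few connections have data; the rest stay idle and cost no CPU time.
active = (3, 42, 77)
for i in active:
    clients[i].sendall(b"ping %d" % i)

handled = 0
while handled < len(active):
    handled += echo_ready_sockets(selector)

replies = {i: clients[i].recv(4096) for i in active}
print(replies)

# Tidy up the demo sockets.
for key in list(selector.get_map().values()):
    selector.unregister(key.fileobj)
    key.fileobj.close()
for client in clients:
    client.close()
selector.close()
```

The loop only touches sockets that the operating system reports as ready, so idle connections consume a file descriptor and a little kernel state but no thread of their own.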
However, asynchronous I/O is not a universal solution: for a database server it is unlikely to be a good way of serving connections, because processing each request requires a large amount of disk I/O and CPU time, which cannot all be done within a single process.
The Twisted Framework is an extensive set of classes and modules for implementing asynchronous network applications. The Twisted Framework is:
- a core that abstracts all asynchronous I/O operations and uses the appropriate OS-specific mechanism;
- the Deferred concept, which makes it simple to express calls to asynchronous network services (e.g., a database, memcached) and to handle errors; Deferred is the asynchronous-programming counterpart of the ordinary constructs of sequential programming;
- an extensive set of already implemented network protocols: HTTP, DNS, SMTP, IMAP, memcached, Jabber, ICQ, etc.; more protocols are available as additional modules;
- an advanced unit-testing infrastructure with support for Deferred, thread pools, processes, etc.;
- a high-quality development process: full unit-test coverage and strict review of every change.
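The Deferred idea from the list above can be illustrated with a toy model. The ToyDeferred class below is a simplified stand-in written for this illustration, not Twisted's implementation; it only borrows the method names addCallback, addErrback and callback from Twisted's Deferred, and it leaves out most of the real behaviour (for example, handlers added after the result has arrived are never run here). The helper functions parse_reply, double, report_error and build_chain are hypothetical.

```python
class ToyDeferred:
    """Toy stand-in for the Deferred idea: a future result plus a handler chain."""

    def __init__(self):
        self._handlers = []     # (on_success, on_error) pairs, in the order added
        self.result = None

    def addCallback(self, fn):
        self._handlers.append((fn, None))
        return self

    def addErrback(self, fn):
        self._handlers.append((None, fn))
        return self

    def callback(self, result):
        """Deliver the result and run it through the chain of handlers."""
        for on_success, on_error in self._handlers:
            failed = isinstance(result, Exception)
            handler = on_error if failed else on_success
            if handler is None:
                continue                    # this step does not apply right now
            try:
                result = handler(result)    # like the next statement in a sequence
            except Exception as exc:
                result = exc                # like jumping to the nearest "except"
        self.result = result


def parse_reply(raw):
    return int(raw)


def double(number):
    return number * 2


def report_error(exc):
    return "error: %s" % type(exc).__name__


def build_chain():
    # Reads like sequential code: parse, then double, with one error handler.
    return ToyDeferred().addCallback(parse_reply).addCallback(double).addErrback(report_error)


good = build_chain()
good.callback("21")         # the reply "arrives", e.g. from memcached
bad = build_chain()
bad.callback("oops")        # parse_reply raises, so the error handler takes over
print(good.result, bad.result)
```

The chain of callbacks plays the role of consecutive statements, and the errback plays the role of an except clause, which is the sense in which Deferred mirrors sequential constructs.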
Additional information:
- Twisted Documentation
- This blog on Habr
- Alexander Burtsev about Twisted
- Deferred in Twisted and .
Presentation
- Download (PDF)
- View on SlideShare