<h3>Optimizing 3proxy for high load</h3>
<p>Precaution 1: 3proxy was not initially developed for high load and is positioned as a SOHO product, mainly because of the "one connection - one thread" model 3proxy uses. 3proxy is known to work with over 200,000 connections under proper configuration, but use it in a production environment under high load at your own risk and do not expect too much.
<p>Precaution 2: This documentation is incomplete and not sufficient. High loads may require very specific system tuning including, but not limited to, specific or customized kernels, builds, settings, sysctls, options, etc. None of this is covered by this documentation.
<h4>Configuring 'maxconn'</h4>
The number of simultaneous connections per service is limited by the 'maxconn' option.
The default maxconn value since 3proxy 0.8 is 500. You may want to set 'maxconn'
to a higher value. Under this configuration:
<pre>
maxconn 1000
proxy -p3129
proxy -p3128
socks
</pre>
maxconn for every service is 1000, and there are 3 services running
(2 proxy and 1 socks), so across all services there can be up to 3000
simultaneous connections to 3proxy.
<p>Avoid setting 'maxconn' to an arbitrarily high value; it should be carefully
chosen to protect the system and proxy from resource exhaustion. Setting maxconn
above the available resources can lead to denial of service conditions.
<h4>Understanding resource requirements</h4>
Each running service requires:
<ul>
<li>1 thread (process)
<li>1 socket (file descriptor)
<li>1 stack memory segment + some heap memory, ~64K-128K depending on the system
</ul>
Each connected client requires:
<ul>
<li>1 thread (process)
<li>2 sockets (file descriptors); for FTP, 4 sockets are required.
<br>1 additional socket (file descriptor) during name resolution for non-cached names
<br>1 additional socket during RADIUS authentication or logging
<li>1 ephemeral port (3 ephemeral ports for an FTP connection)
<li>1 stack memory segment of ~32K-128K depending on the system + at least 16K and up to a few MB (for 'proxy' and 'ftppr') of heap memory. If you are short of memory, prefer 'socks' to 'proxy' and 'ftppr'.
</ul>
Also, additional resources like system buffers are required for network activity.
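<p>As a rough worked example based on the estimates above (illustrative, not measured): a single 'socks' service handling 1000 simultaneous clients needs approximately:
<pre>
threads:          1000 + 1     (1 per client + 1 for the service)
file descriptors: 2*1000 + 1   (more during name resolution or RADIUS)
ephemeral ports:  1000
stack memory:     1000 * 32K-128K = ~32M-128M, plus heap
</pre>
Size 'maxconn', ulimits and available memory with such estimates in mind.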
<h4>Setting ulimits</h4>
Hard and soft ulimits must be set above the calculated requirements. Verify that
ulimits match your expectations, especially if you run 3proxy under a dedicated account,
by adding e.g.
<pre>
system "ulimit -Ha >>/tmp/3proxy.ulim.hard"
system "ulimit -Sa >>/tmp/3proxy.ulim.soft"
</pre>
at the beginning (before the first service is started) and at the end of the config file.
Perform both a hard restart (that is, kill and start the 3proxy process) and a soft restart
by sending SIGUSR1 to the 3proxy process, and check that the ulimits recorded to the files
match your expectations.
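<p>How limits are raised is system-specific. As an illustrative sketch for a Linux system where 3proxy runs under a dedicated '3proxy' account (the values are examples, not recommendations):
<pre>
# /etc/security/limits.conf
3proxy  soft  nofile  65536
3proxy  hard  nofile  65536
</pre>
Note that if 3proxy is started by systemd, limits from the unit file (e.g. LimitNOFILE=) apply instead of PAM limits, so validate with the ulimit recording trick above.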
<h4>Extending system limitations</h4>
Check the manuals / documentation for your system's limitations. You may need to change
sysctls or even rebuild the kernel from source.
To help with system-dependent settings, 3proxy supports different socket options
which can be set via the -ol option for the listening socket, -oc for the proxy-to-client
socket and -os for the proxy-to-server socket. Example:
<pre>
proxy -olSO_REUSEADDR,SO_REUSEPORT -ocTCP_TIMESTAMPS,TCP_NODELAY -osTCP_NODELAY
</pre>
Available options are system-dependent.
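<p>As an illustration, on Linux the following sysctls commonly matter for connection-heavy workloads (the values are illustrative; suitable settings depend on kernel version and workload, so consult your system documentation before changing them):
<pre>
# /etc/sysctl.conf (illustrative values)
fs.file-max = 500000
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 4096
</pre>
Apply with "sysctl -p" and re-check behavior under load.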
<h4>Extending ephemeral port range</h4>
Check the ephemeral port range for your system and extend it to the required number of ports.
The ephemeral range is always limited by the maximum number of ports (64K). To extend
outgoing connections above this limit, extending the ephemeral port range is not enough;
you need additional actions:
<ol>
<li> Configure multiple outgoing IPs
<li> Make sure 3proxy is configured to use different outgoing IPs by either using
multiple services with different external interfaces or via "parent extip" rotation.
<li> You may need additional system-dependent actions to use the same port on different IPs,
usually by adding the SO_REUSEADDR socket option to the external socket. This option can be
set (since 0.9 devel) with the -osSO_REUSEADDR option:
<pre>
proxy -p3128 -e1.2.3.4 -osSO_REUSEADDR
</pre>
</ol>
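<p>The steps above can be sketched as a single configuration (the IP addresses are illustrative, and "parent extip" rotation assumes a recent 3proxy version):
<pre>
allow *
parent 500 extip 1.2.3.4 0
parent 500 extip 1.2.3.5 0
proxy -p3128 -osSO_REUSEADDR
</pre>
Outgoing connections are spread evenly between the two external IPs, roughly doubling the number of usable ephemeral ports. On Linux the range itself is controlled by the net.ipv4.ip_local_port_range sysctl.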
<h4>Setting stacksize</h4>
'stacksize' is a size added to all stack allocations and can be both positive and
negative. Stack is required for function calls. 3proxy itself doesn't require a large
stack, but one can be required if some
poorly-written libc, 3rd party libraries or system functions are called. There is known
dirty code in Unix ODBC
implementations and built-in DNS resolvers, especially in the case of IPv6 and a large
number of interfaces. On most 64-bit systems extending stacksize will lead
to additional address space usage, but does not require actually committed memory,
so you can increase stacksize to a relatively large value (e.g. 1024000) without
the need to add physical memory;
but this is system/libc dependent and requires additional testing on your
installation. Don't forget about memory-related ulimits.
<p>For 32-bit systems address space can be a bottleneck you should consider. If
you're short of address space you can try to use a negative stack size.
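<p>For example, to add roughly 1MB to every thread's stack allocation, place the option before the services it should affect (the value is illustrative; test it on your installation):
<pre>
stacksize 1024000
proxy -p3128
socks
</pre>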
<h4>Known system issues</h4>
There are known race condition issues in the Linux / glibc resolver. The probability
of a race condition increases under configurations with IPv6, a large number of interfaces,
IP addresses or configured resolvers. In this case, install a local recursor and
use the 3proxy built-in resolver (nserver / nscache / nscache6).
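<p>A minimal configuration for this setup, assuming a local recursor listening on 127.0.0.1 (the cache sizes are illustrative):
<pre>
nserver 127.0.0.1
nscache 65536
nscache6 65536
</pre>
This bypasses the glibc resolver entirely and caches answers inside 3proxy.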
<h4>Avoid large lists</h4>
Currently, 3proxy is not optimized to use large ACLs, user lists, etc. All lists
are processed linearly. In the devel version you can use RADIUS authentication to avoid
user lists and ACLs in 3proxy itself. RADIUS also makes it easy to set the outgoing IP
on a per-user basis or implement more sophisticated logic.
RADIUS is a new beta feature; test it before using it in production.
<h4>Avoid changing configuration too often</h4>
Every configuration reload requires additional resources. Do not make frequent
changes, like user addition/deletion via the configuration; use alternative
authentication methods instead, like RADIUS.
<h4>Do not monitor configuration files directly</h4>
Using the configuration file directly in 'monitor' can lead to a race condition where
the configuration is reloaded while the file is still being written.
To avoid race conditions:
<ol>
<li> Update config files only if there is no lock file
<li> Create the lock file when the 3proxy configuration is updated, e.g. with
"touch /some/path/3proxy/3proxy.lck"
<li> Add
<pre>
system "rm /some/path/3proxy/3proxy.lck"
</pre>
at the end of the config file to remove it after the configuration is successfully loaded
<li> Use a dedicated version file to monitor, e.g.
<pre>
monitor "/some/path/3proxy/3proxy.ver"
</pre>
<li> After the config is updated, change the version file for 3proxy to reload the configuration,
e.g. with "touch /some/path/3proxy/3proxy.ver".
</ol>
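<p>The updater side of this protocol can be sketched as a shell fragment (paths match the example above; the config-editing step is a placeholder):
<pre>
#!/bin/sh
LCK=/some/path/3proxy/3proxy.lck
VER=/some/path/3proxy/3proxy.ver
while [ -e "$LCK" ]; do sleep 1; done  # wait until no lock file (step 1)
touch "$LCK"                           # take the lock (step 2)
# ... edit the 3proxy config file here ...
touch "$VER"                           # bump the version file (step 5)
# 3proxy removes the lock itself when the config loads successfully (step 3)
</pre>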
<h4>Use TCP_NODELAY to speed up connections with small amounts of data</h4>
If most requests require exchanging a small amount of data in both directions
without the need for bandwidth, e.g. messengers or small web requests,
you can eliminate the Nagle's algorithm delay with the TCP_NODELAY flag. Usage example:
<pre>
proxy -osTCP_NODELAY -ocTCP_NODELAY
</pre>
sets TCP_NODELAY for client (oc) and server (os) connections.
<h4>Use splice to speed up transfers of large amounts of data</h4>
splice() allows copying data between connections without copying it to the process
address space. It can speed up the proxy on high bandwidth connections if most
connections require large data transfers. "-s" enables splice usage. Example:
<pre>
proxy -s
</pre>
Splice is only available on Linux and is currently a beta option available in the
devel version. Do not use it in production without testing. Splice requires
more system buffers, but reduces process memory usage.
Do not use splice if there are a lot of short-lived connections with no bandwidth
requirements.