Wikipedia:Do worry about performance

From Wikipedia, the free encyclopedia

This essay, Do worry about performance, aims to improve the general understanding of article display speed and storage, and to offer practical ideas that keep Wikipedia working well for everyone. Wikipedia is a relatively complicated system, and it is very difficult for any one person to fully understand how to improve it, but by working together, people can make the system more responsive for everyone.

The title of this essay is derived from the early essay "WP:Don't worry about performance", which was written years before there were 4 million articles and hundreds of thousands of templates. Originally, it was believed (or imagined by some) that articles would display fast regardless of their contents, and later it was hoped that this might be true. Finally, the reality became clear: numerous users, ignoring performance for years, had created gargantuan articles that could not be displayed in less than 30 seconds. That seemed to be the end of instant access to the "sum of all human knowledge".

Eventually, many techniques were found to make articles much faster, so that they displayed within 9 seconds, and people began to discuss performance issues in a realistic manner. People also learned to avoid some settings in Special:Preferences that bypassed the quick cached-page feature, such as a thumbnail size set lower or higher than 220px (as of 2011).
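The effect of a non-default preference on cached pages can be pictured with a toy model. The sketch below is not MediaWiki's actual caching code; the class `PageServer`, its methods, and the render counter are hypothetical names used only for illustration. It assumes a single shared cached copy per title that is valid only for the default 220px thumbnail size, so any other size forces a fresh render on every view.

```python
DEFAULT_THUMB_PX = 220  # default thumbnail width mentioned in the essay (2011)


class PageServer:
    """Toy page server (hypothetical): one shared cached copy per title."""

    def __init__(self) -> None:
        self.cache = {}   # title -> rendered page for default preferences
        self.renders = 0  # how many times the slow render step has run

    def render(self, title: str, thumb_px: int) -> str:
        # Stand-in for the expensive template-expansion step.
        self.renders += 1
        return f"<html>{title} @ {thumb_px}px</html>"

    def view(self, title: str, thumb_px: int = DEFAULT_THUMB_PX) -> str:
        if thumb_px != DEFAULT_THUMB_PX:
            # Assumption: the shared copy does not match a non-default
            # preference, so the page is rendered afresh each time.
            return self.render(title, thumb_px)
        if title not in self.cache:
            self.cache[title] = self.render(title, thumb_px)
        return self.cache[title]
```

In this model, three views with the default size cost one render, while three views with a 300px preference cost three renders, which is the kind of difference the essay describes for logged-in users with special preferences.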

Why performance mattered and still does

In the early years, Wikipedia was viewed as the actions of individual users, each editing various articles, with occasional troublemakers who would be spotted and stopped before they created any performance bottlenecks in viewing Wikipedia articles and project pages. However, the major problems came not from the occasional troublemakers but from many thousands of users acting without concern for performance.

What had remained unseen was the broader picture. Some people initially created modest templates, or infoboxes, to include in several pages; later, someone expanded those templates with more features, while other people added the templates to more articles. After a year or so, people expanded the features to have sub-features in many templates. Then intense users started putting templates into every article they could find, or ran bots (or edit-tools) to re-edit every article and force template usage. After about 2 more years, the various templates with sub-features were expanded to have cross-features to handle exotic, rarely used cases, which were nonetheless checked in every article, during every edit-preview and whenever the article was reformatted for any small change.

The templates became victims of "creeping featurism" and were not redesigned to avoid the combined, convoluted features. In some cases, complicated (slow) templates were used as a means to force a style into thousands of articles (style-pushing). Every time the style was adjusted, even slightly, thousands of articles had to be reformatted to match, even if the changed style was not used in many of them. If only people had been thinking about performance, they could have redesigned the templates and articles earlier to avoid rare options, allow flexible styles, or run tedious checks for options only under the related rare conditions.

Bottom navboxes (wp:navboxes) were appended to many thousands of articles, many only vaguely related to the article's topic. Many articles doubled in size to become half-filled with 10 or more navboxes.
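The idea of running tedious checks only under the related rare conditions can be sketched in a few lines. This is an illustration in Python rather than template markup, and every name in it (`RARE_OPTIONS`, `expensive_check`, `render_eager`, `render_guarded`, and the option names) is hypothetical. It assumes that a rare option costs nothing to skip when a page does not set it.

```python
RARE_OPTIONS = ("coat_of_arms", "former_name", "twin_city")  # hypothetical names

calls = {"expensive": 0}  # counts how often the slow check runs


def expensive_check(value: str) -> str:
    """Stand-in for a slow validation/formatting step for a rare option."""
    calls["expensive"] += 1
    return value.strip().title()


def render_eager(params: dict) -> list:
    """Checks every rare option on every page, whether or not it is used."""
    rows = [f"name: {params['name']}"]
    for option in RARE_OPTIONS:
        checked = expensive_check(params.get(option, ""))
        if checked:
            rows.append(f"{option}: {checked}")
    return rows


def render_guarded(params: dict) -> list:
    """Runs the slow check only when the rare option is actually set."""
    rows = [f"name: {params['name']}"]
    for option in RARE_OPTIONS:
        if params.get(option, "").strip():  # cheap test guards the slow path
            rows.append(f"{option}: {expensive_check(params[option])}")
    return rows
```

For three pages of which only one sets a single rare option, the eager version runs the slow check nine times and the guarded version runs it once, while both produce the same rows.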

Finally, Wikipedia became bizarrely slow. Although all articles had displayed within 1 second in the early years, eventually hundreds of major articles took 15-20 seconds to display for logged-in users with special preferences (such as image size). Even some of the most-viewed articles required a staggering 30 seconds of template-based formatting before the initial display. Those articles were viewed many thousands of times per day, and many logged-in users had to wait the full 30 seconds to view them. The situation became worse: most people did nothing to reduce the 30-second delays, or fought against improving them, and in several cases the delays continued for over 2 years.

Templates were optimized

By 2009, the performance crisis was well underway, and by 2010 even the old essay ("Don't worry about perf..") had been modified to be less cocksure that performance would not suffer if totally ignored. The critical importance of performance analysis was being understood, and numerous debates were held to find faster methods to accomplish similar results, or to question the whole basis of having such complexity in pages. Various people tried different techniques, and word spread throughout Wikipedia that much faster methods could be used. Often, templates could be made 10, 20, or 30 times faster, and articles could easily be displayed twice as fast during edit-preview.

In many cases, the templates were not needed at all, such as those forcing the inclusion of hidden "collapsed" tables, which should instead have been linked from the article and displayed separately. Each large article had begun to resemble a mini-newspaper, with the leading textual content up front, followed by long bottom sections of collapsified ads ("classified ads") in the form of several navboxes tacked onto the article as info-adverts, offering thousands of other articles to consider as the next link ("read me, read me") after reading the current page.

See also

[ This essay is a quick draft to be expanded later. ]