Computer Scientists Find a Key Research Algorithm's Limits
source link: https://www.wired.com/story/computer-scientists-find-a-key-research-algorithms-limits/
Many aspects of modern applied research rely on a crucial algorithm called gradient descent. This is a procedure generally used for finding the largest or smallest values of a particular mathematical function—a process known as optimizing the function. It can be used to calculate anything from the most profitable way to manufacture a product to the best way to assign shifts to workers.
Yet despite this widespread usefulness, researchers have never fully understood which situations the algorithm struggles with most. Now, new work does exactly that, establishing that gradient descent, at heart, tackles a fundamentally difficult computational problem. The new result places limits on the type of performance researchers can expect from the technique in particular applications.
“There is a kind of worst-case hardness to it that is worth knowing about,” said Paul Goldberg of the University of Oxford, coauthor of the work along with John Fearnley and Rahul Savani of the University of Liverpool and Alexandros Hollender of Oxford. The result received a Best Paper Award in June at the annual Symposium on Theory of Computing.
You can imagine a function as a landscape, where the elevation of the land is equal to the value of the function (the “profit”) at that particular spot. Gradient descent searches for the function’s local minimum by identifying the direction of steepest ascent at a given location and then stepping the opposite way, downhill. The slope of the landscape is called the gradient, hence the name gradient descent.
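The procedure described above can be sketched in a few lines. This is a minimal illustration, not code from the paper; the function, step size, and iteration count are all choices of mine.

```python
# Minimal gradient descent sketch: minimize f(x) = (x - 3)^2 by
# repeatedly stepping against the gradient (i.e., downhill).

def gradient_descent(grad, x0, step=0.1, iters=100):
    """Follow the negative gradient from x0 for a fixed number of steps."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x)  # move opposite the direction of steepest ascent
    return x

# f(x) = (x - 3)^2 has gradient f'(x) = 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

On a smooth bowl-shaped function like this one the procedure converges to the single minimum; the new result concerns how hard this search can be in general.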
Gradient descent is an essential tool of modern applied research, but there are many common problems for which it does not work well. Before this research, however, there was no comprehensive understanding of exactly what makes gradient descent struggle and when—questions another area of computer science, known as computational complexity theory, helped to answer.
“A lot of the work in gradient descent was not talking with complexity theory,” said Costis Daskalakis of the Massachusetts Institute of Technology.
Computational complexity is the study of the resources, often computation time, required to solve or verify the solutions to different computing problems. Researchers sort problems into different classes, with all problems in the same class sharing some fundamental computational characteristics.
To take an example—one that’s relevant to the new paper—imagine a town where there are more people than houses and everyone lives in a house. You’re given a phone book with the names and addresses of everyone in town, and you’re asked to find two people who live in the same house. You know you can find an answer, because there are more people than houses, but it may take some looking (especially if they don’t share a last name).
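The phone-book example above can be made concrete with a short search routine. This is a toy sketch of my own: the pigeonhole principle guarantees a collision exists when there are more people than houses, but finding it still requires looking through the book.

```python
# Toy version of the phone-book problem: scan entries and report the
# first two names found sharing an address.

def find_housemates(phone_book):
    """Return the first pair of names that share an address."""
    seen = {}  # address -> a name already seen at that address
    for name, address in phone_book:
        if address in seen:
            return seen[address], name
        seen[address] = name
    return None  # cannot happen if there are more people than houses

book = [("Ada", "1 Elm St"), ("Ben", "2 Oak St"), ("Cyn", "1 Elm St")]
pair = find_housemates(book)
```

A solution is guaranteed to exist and any proposed pair is easy to check against the book—the two properties that define the TFNP class discussed next.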
This question belongs to a complexity class called TFNP, short for “total function nondeterministic polynomial.” It is the collection of all computational problems that are guaranteed to have solutions and whose solutions can be checked for correctness quickly. The researchers focused on the intersection of two subsets of problems within TFNP.
The first subset is called PLS (polynomial local search). This is a collection of problems that involve finding the minimum or maximum value of a function in a particular region. These problems are guaranteed to have answers that can be found through relatively straightforward reasoning.
One problem that falls into the PLS category is the task of planning a route that visits some fixed number of cities with the shortest travel distance possible, given that you can only ever change the trip by switching the order of any pair of consecutive cities in the tour. It’s easy to calculate the length of any proposed route, and with such a limit on the ways you can tweak the itinerary, it’s easy to see which changes shorten the trip. You’re guaranteed to eventually find a route you can’t improve with an acceptable move—a local minimum.
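The local-search process described above can be sketched directly. This is an illustrative implementation under my own assumptions (cities as 2D points, Euclidean distance); the only allowed move is swapping two consecutive cities, and the search stops at a local minimum.

```python
# Local search for a short tour where the only allowed move is swapping
# two consecutive cities. Stops when no such swap shortens the tour.
import math

def tour_length(tour, coords):
    """Total length of a closed tour over 2D city coordinates."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def local_search(tour, coords):
    """Apply improving adjacent swaps until reaching a local minimum."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(len(tour) - 1):
            candidate = list(tour)
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            if tour_length(candidate, coords) < tour_length(tour, coords):
                tour, improved = candidate, True
    return tour

# Four cities at the corners of a unit square; the starting tour crosses itself.
coords = {"A": (0, 0), "B": (0, 1), "C": (1, 1), "D": (1, 0)}
best = local_search(["A", "C", "B", "D"], coords)
```

Each step is cheap to evaluate, and the tour length can only decrease—which is why a locally unimprovable route must eventually be reached, the hallmark of a PLS problem.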
The second subset of problems is PPAD (polynomial parity arguments on directed graphs). These problems have solutions that emerge from a more complicated process called Brouwer’s fixed point theorem. The theorem says that for any continuous function mapping a space like a ball onto itself, there is guaranteed to be one point that the function leaves unchanged—a fixed point, as it’s known. This is true in daily life. If you stir a glass of water, the theorem guarantees that there absolutely must be one particle of water that will end up in the same place it started from.
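A simple numeric illustration of Brouwer’s guarantee: the theorem only promises that a fixed point exists, but for the well-behaved map I chose below (a contraction), plain iteration even finds it. The function and tolerances are my own choices, not from the article.

```python
# cos maps the interval [0, 1] into itself, so Brouwer guarantees a point
# with cos(x) == x. Repeated application converges to it here because the
# map is a contraction near that point.
import math

def find_fixed_point(f, x0=0.5, tol=1e-12, max_iters=1000):
    """Iterate x -> f(x) until it stops moving."""
    x = x0
    for _ in range(max_iters):
        nxt = f(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x

fp = find_fixed_point(math.cos)  # the point where cos(x) == x
```

In general, PPAD problems are hard precisely because no such easy iteration is guaranteed to home in on the fixed point that the theorem promises.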