Ask HN: Do you use an optimization solver? Which one? Do you like it?
209 points by ryan-nextmv 16 hours ago | 95 comments

We’re trying to bring new ideas to the optimization space. We’ve used a bunch of technologies over time (CP, MIP, LP, heuristics, etc.). We’ve built our own solver (for https://www.nextmv.io) and learned a lot along the way.
Solvers are amazing. MIP solvers, for example, are some of the fastest and most mature software packages that exist. They’re also wickedly challenging to use effectively.
Do you use optimization solvers? Which ones? Why? Do you like them? Why or why not?
CPLEX and Xpress were both founded in the eighties. In the nineties, CPLEX was acquired by ILOG (a French CP company), which in turn was purchased by IBM around 2009. Around the same time, the original technical co-founder of CPLEX, along with the two latest head developers, left CPLEX to found Gurobi. Since then, there has been a slow trickle of developers leaving CPLEX for Gurobi... until 2020, when CPLEX suddenly lost its 7 remaining devs over 6 months (because of catastrophic mismanagement at IBM, from what I heard). Unsurprisingly, those devs mostly ended up at Gurobi, so the CPLEX team of 20 years ago is now essentially Gurobi. Other CPLEX devs ended up at Xpress, which had been purchased around 2008 by FICO (the credit scoring company).
Meanwhile, there is also a smaller Danish company, Mosek, that does its own thing (they have a MIP solver, but their focus seems to be on their amazing conic optimization code). And SAS (the analytics giant) has a small MIP team too.
Then over the last 2 years, three new solvers appeared out of China: COPT (by Cardinal Operations, a startup by Stanford graduates), MindOpt (Alibaba Research) and OptVerse (Huawei). They mostly have LP solvers for now, but for newcomers, the performance is extremely impressive. This is only partially out of nowhere, though: COPT in particular has hired several devs from the incumbents.
On the academic side, ZIB (a PhD-granting research institute in Berlin) maintains a source-available family of solvers, and has been a steady provider of talent for commercial solvers. The dev behind SoPlex (LP solver) went to CPLEX after his PhD and now Gurobi. The main dev behind SCIP did the same, and is now VP of R&D at Gurobi. Many more XPRESS and Gurobi people did their PhDs at ZIB.
The Coin-OR open-source codes for LP (clp) and MIP (cbc) were written decades ago by a founding father of computational optimization, John Forrest, now retired from IBM research. Their source code is difficult to read, and Coin-OR aims to eventually replace them with a new code, HiGHS. The dev who wrote the simplex code of HiGHS as his PhD thesis went on to XPRESS and now COPT. The dev who writes the MIP code of HiGHS comes from ZIB.
As you can see, everyone is very inter-connected. Hence the throwaway :-).
1. At least they exist! They give a good rough idea of the relative performance of the various solvers. Also, I think it provides a sense of competition that is great for innovation. Some of the newcomers push press releases about having "won a prize" when they leapfrog the competition in one of the (sometimes weekly) benchmarks.
2. For the MIP benchmarks [2], a lot of thought has gone into doing them properly [3].
Don't be scared by the "End of a benchmarking era" banner. In 2018, Gurobi cherry-picked numbers that made them look good in some promotional material, while suggesting endorsement by Hans Mittelmann. There was a whole fuss about it and they had to publicly apologize [4]. This was a very dumb move since they were the fastest solver anyways, just not by as much as they claimed. CPLEX and Xpress took this as an excuse to take their toys and go home (they have since refused to participate in benchmarking). Gurobi was banned from the benchmarks for a while, but now they're back.
[1] http://plato.asu.edu/bench.html
[2] http://plato.asu.edu/ftp/milp.html
[3] https://link.springer.com/article/10.1007/s12532-020-00194-3
[4] https://web.archive.org/web/20181224083542/http://www.gurobi...
* Minion for IP for scheduling + edge crossing minimization.
* cvxpy (wrapping ECOS, OSQP, and SCS by default) for convex optimization for making nice geometry.
* Z3 and STP for SAT/SMT for program analysis.
All are FLOSS, which is my main criterion in many situations.
Beyond that, I like minion for its focus on only providing efficiently implementable primitives, cvxpy for the awesome lectures and documentation the folks behind it have produced, and z3 + stp for their use in tools I like, such as klee.
Most open-source solvers don't handle parallelization well, and they lack the latest research on techniques like branch-and-cut and primal heuristics that can speed things up significantly.
In my experience, Gurobi is still the leader for LP and MIP solving. But it's really expensive, and the licensing terms seem anachronistic.
SimpleRose.com was a startup working on a new solver, too - I'm curious if anybody has tried it yet?
To OP's question, besides JuMP, the other use case I've had for optimization is for optimizing the drawing order of pen plots, for which I used or-tools. A write-up on it is here: https://nb.paulbutler.org/optimizing-plots-with-tsp-solver/
I remember liking JuMP, but Julia itself didn't feel ready yet. Some of the packages had weird behaviors. For example, Gadfly took several minutes to render some of my charts. IIRC when I looked at the source, it was solving a MIP to compute the best bounding box for display purposes.
I should probably check it out again.
Also: agreed regarding Gurobi licensing.
If I'm solving an optimization problem for personal curiosity, a blog post, etc., JuMP is the tool I reach for. :)
Our go-to is LINDO, mainly because it is powerful and it can be integrated into an Excel model which will let us share models with clients.
It was much easier to build a score-calculator for min/max based on user-tweaked parameters, then jiggle the data, re-calculate the score, and keep doing it until there was significant improvement (again, standard SA). I would absolutely love to convert the entire production plan logic with material availability, lead-times, customer demand, quality control windows etc. to something like nextmv.io but looking at your docs, I have no idea where to begin.
Cost is a big factor too because 3 years ago I bought 4 old 24-core Xeons off eBay and they've been chugging non-stop simulating billion+ flops per hour with electricity being the only cost. I don't mind paying $50-100/day for cloud if the results are great and code is easy to manage. We have the same chicken-egg problem everyone in supply chain currently has - we don't have enough materials to make everything, don't know when we'll get everything, and so don't know the best order to buy/make everything in. I would love to write a solver for this using our dataset but I kind of don't want to re-invent the wheel.
As it stands, every solver I find is one layer of abstraction away from what I want to code in. I can explain the problem at length if you want, but it's honestly nothing unique: just standard MRP/ERP planning with a ton of BOM items, PO delays, labor/machine capacity constraints, etc.
Your existing tutorials explain how I can use your API/SDK to perform OR operations. That's great and a necessity. However, it's not sufficient for me because my questions are: How do I represent my production calendar in the JSON blob for your algo? How do I put a constraint of 100hrs/week/machine but also 168hrs/week on a room full of specific machines? In other words, while each machine can run 100hrs/week, if there are 4 of them in the same room, only one can run at a time, and so the combined machines in a given room cannot exceed 168hrs/week. Maybe a tutorial or a higher-level library would help people like me convert business rules into the JSON format for your APIs. Because even if I might be capable of using your API as-is, I unfortunately don't have the time to implement these things myself. Hope this makes sense and gives you some insight into at least one of your target use cases.
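For what it's worth, the room rule described here is just a second family of linear capacity constraints layered over the per-machine ones. A hypothetical sketch (data shapes and names invented for illustration) of checking a weekly hour assignment against both families:

```python
# In a MIP these become two groups of linear constraints:
#   hours[m] <= 100                       for every machine m
#   sum(hours[m] for m in room) <= 168    for every room
MACHINE_LIMIT = 100
ROOM_LIMIT = 168

def violations(hours, room_of):
    """Return a list of violated capacity constraints."""
    bad = [f"machine {m} over {MACHINE_LIMIT}h"
           for m, h in hours.items() if h > MACHINE_LIMIT]
    room_hours = {}
    for m, h in hours.items():
        room_hours[room_of[m]] = room_hours.get(room_of[m], 0) + h
    bad += [f"room {r} over {ROOM_LIMIT}h"
            for r, h in room_hours.items() if h > ROOM_LIMIT]
    return bad

# Four 100h-capable machines sharing one room: each is individually
# fine, but jointly they blow the 168h room budget.
hours = {"m1": 60, "m2": 60, "m3": 60, "m4": 60}
room_of = {m: "roomA" for m in hours}
```

The point is that the "room" rule needs no new machinery, just one extra inequality per room over the same variables.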
We haven't put a lot more examples into our SDK docs lately since we've been working more on our routing engine and cloud offering. Now we're getting back to more foundational questions like "what should our solver be?" and "how should model formulation work?"
Hop started off as a decision diagram solver, but even internally we strap a lot of different techniques onto it. My hope is to support those patterns better, which is really why I posed this question.
I'd be interested to know: what made the process of converting business logic into solver-speak painful?
The fact that every solver doc I found shows me two near-identical variables and 4 identical constraints. I can solve that with an Excel add-in. My problem is mind-numbingly complex, like everyone else's in supply-chain optimization. So I need examples for each type of constraint I have, but instead the tools expect me to figure out everything from very generic examples. Dates are a big deal in what I do. I have no idea how to convert "each day the customer order is delayed is bad, but I also can't make things 3 months before they want it" into solver-speak. Cleaning a machine takes time, so it is often better to make similar things in a row (campaign mode). No idea how to convert that to solver-speak either.
The good thing is that once I understand how to convert my data into solver-speak, I can keep it updated on the fly since business logic doesn't change daily, just the data-set.
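In solver-speak, rules like these usually become penalty terms and hard windows in the objective and constraints. A rough pure-Python sketch of the pieces described above; all names and weights are invented for illustration:

```python
# "Each day late is bad"            -> linear tardiness penalty.
# "Can't make it 3 months early"    -> hard earliness window.
# "Similar things in a row"         -> sequence-dependent setup cost.
LATE_COST_PER_DAY = 10
MAX_EARLY_DAYS = 90
CLEANING_COST = 25

def order_cost(completion_day, due_day):
    """Penalty for one order; None marks an infeasible (too-early) plan."""
    if due_day - completion_day > MAX_EARLY_DAYS:
        return None  # hard constraint violated
    return max(0, completion_day - due_day) * LATE_COST_PER_DAY

def changeover_cost(sequence):
    """Cleaning cost each time the product family changes (campaigns)."""
    return sum(CLEANING_COST
               for a, b in zip(sequence, sequence[1:]) if a != b)

# Two campaigns (A, A, B, B) need one cleaning; alternating needs three.
```

A MIP or CP model expresses the same three ideas with variables instead of plain numbers, but the translation of the business rule is identical.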
Feel free to contact me directly and/or we can Zoom etc. if you wish to discuss further. Wish you the best of luck either way!
What I saw often, and this was in finance, was that one or two developers would gain proficiency in converting problems to a specific solver tool, and then we could never move to a different library because relearning would have been so much effort.
If your problems can be attacked by LP/MIP-type stuff, there's a book, "Model Building in Mathematical Programming" by Williams, that has a couple dozen optimisation problems across a range of applications, with worked examples of how to formulate a model to solve each of them. Each problem will be much less complex than your real-world optimisation puzzle, but if you browse the problem descriptions you might be able to match some against parts of your situation and get ideas for how Williams would model and optimise it.
It really feels like a tools or language problem. Heck, we used to have to manually work out derivatives for continuous optimization problems, but nowadays programming languages with performant built-in autodiff often make this trivial. Removing the manual derivation hassle let loose a flood of cool ideas and applications, even though there was no technical hurdle preventing them in the first place.
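The "built-in autodiff" point is easy to demonstrate: forward-mode automatic differentiation is just arithmetic on (value, derivative) pairs. A toy dual-number sketch, which frameworks like PyTorch and JAX generalize far beyond this:

```python
class Dual:
    """Forward-mode autodiff via dual numbers: (value, derivative)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f and f' at x in a single pass, no symbolic work."""
    return f(Dual(x, 1.0)).dot

# d/dx (x*x + 3x) at x=2 is 2*2 + 3 = 7, with no hand-derived gradient.
```

This is exactly the "removing the manual derivation hassle" effect: the user writes the objective, and derivatives come for free.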
Alternate problem specifications is a well-explored area (what is Prolog if not a way of describing problems for a constraint satisfier?), but I wonder how many other neat things are dammed up behind usability problems.
I have a niggling feeling somewhere that JSON is the wrong language for this. Something like cue lang may be the right thing. (https://www.sobyte.net/post/2022-04/cue/)
I think there are much better algorithms in metaheuristic search than just simulated annealing.
* CLP/CBC: open source makes deployment and devops easy, which is great. Linear models are nice in that you "know what you're getting." Performance is at times a pain point.
* Gurobi: super fast, but the licensing was just impossible. Partly that was due to high cost, but ultimately we could have borne the cost; the inability to do something like have autoscaling containers using Gurobi was ultimately the dealbreaker for us. As Zoba grew, we had to turn to alternatives.
* NLOpt: absolutely a blessing, but the variety of algorithms and tuning parameters is really opaque, even for a team with some OR experience/background.
* OR-tools: powerful but the documentation is remarkably terrible, and the vehicle routing problem solver doesn't natively support all the constraints we'd like, so we have to do some hacks.
Overall my feeling for all these tools is roughly gratitude: solvers are complex, and rolling our own would be absolutely impractical at our size. But also there's some pain in between "I've formulated my problem as a nice mathematical model" and "I can reliably get optimal results on time."
NextMV was practically the opposite (at the time; I'm sure it has improved since, especially as they used to be far more insistent on decision diagrams): rather bad performance, but excellent in terms of licensing and deploying the code, and they had great support too. The modern deployment made sense given they were a new company. One silver lining to the poor performance, and why I used to stick up for them at the time, was that you could get somewhat acceptable results fast: if you stopped after one second, CBC might be absolutely nowhere, but NextMV's solver would at least give you something. This meant you could do things that relied on extremely fast results, like trying a configuration, checking the (approximate) solution, then trying a bunch more, all very quickly.
In the end I mostly settled on OR-Tools.
Gurobi, AFAIK, is considered the leading edge of the field, and other tools, especially open-source ones, do not perform as well.
I cannot speak specifically for your firm and use cases, but in my experience, if the use cases are structured and specific enough, rolling out an in-house optimizer is not such a bad idea.
At one time, OR looked good to me, and I bet a lot of my career on it: Off and on I heard about various parts of optimization. Then at FedEx, in a rush I wrote some simple software that produced nice fleet schedules and literally saved the company. Smith's remark "solved the most important problem ...".
But since airplanes are staggeringly expensive, I considered optimization. That looked like 0-1 integer linear programming with mostly just one row for each city to be served and one column for each candidate single airplane tour from Memphis and back. In generating the candidate tours, could easily handle lots of complicated costing and really goofy constraints. The optimization step itself was just set covering.
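The set-covering step described here has a compact statement: minimize total tour cost subject to every city being covered by at least one chosen tour. A tiny brute-force sketch of that model on invented toy data; a real instance would of course go to a MIP solver:

```python
from itertools import combinations

# Toy instance: candidate single-airplane tours, each covering a set of
# cities at some cost. One binary variable per tour and one "covered"
# constraint per city: exactly the 0-1 set-covering structure described.
tours = {
    "t1": ({"MEM", "ATL"}, 5),
    "t2": ({"ATL", "ORD"}, 6),
    "t3": ({"MEM", "ORD"}, 4),
    "t4": ({"MEM", "ATL", "ORD"}, 9),
}
cities = {"MEM", "ATL", "ORD"}

def best_cover(tours, cities):
    """Exhaustive search: cheapest subset of tours covering every city."""
    best = None
    for r in range(1, len(tours) + 1):
        for combo in combinations(tours, r):
            covered = set().union(*(tours[t][0] for t in combo))
            cost = sum(tours[t][1] for t in combo)
            if covered >= cities and (best is None or cost < best[1]):
                best = (set(combo), cost)
    return best
```

The appeal of this formulation, as the comment notes, is that all the goofy costing and constraints live in tour *generation*; the optimization itself stays a clean covering problem.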
But my wife was still in Maryland; the promised stock was 18+ months late; I went and got an applied math Ph.D. with a lot of emphasis on optimization -- dissertation in stochastic optimal control. I taught linear programming in a business school (trying to help my wife recover from the stress of her Ph.D. -- she never recovered and her body was found floating in a lake; the two Ph.D. degrees were very costly).
Then, after sending 1000+ resumes, net, I had to conclude: Nearly no one in business gives as much as even a small fart about optimization or operations research.
I did stumble onto some small projects: One of these was a marketing resource allocation problem, 0-1 integer linear programming, 600,000 variables, 40,000 constraints. I used the IBM OSL (Optimization Subroutine Library), called from some Fortran code I wrote, and used some Lagrangian relaxation -- in 900 seconds of computing, with as I recall 500 primal-dual iterations, got a feasible solution, from the Lagrange multiplier bounding, within 0.025% of optimality. The customer was not much interested: The CEO had a buddy, his CTO, and my success had apparently embarrassed the CTO and, thus, displeased the CEO. I never talked with the CTO after my success, but his belief was that his simulated annealing, running for days, would be the best approach. So, he was torqued at my success.
Had a similar situation in another problem.
I had to conclude: Optimization seems like an obviously valuable step and tool, often quite worthwhile. E.g., on some $100 million project, optimization might save 10%, i.e., $10 million.
There were some Nobel prizes based on optimization.
There have been lots of successful, valuable, optimization projects reported.
So, there should be plenty of people in business eager to pursue optimization? Right? I assumed so. I was wrong.
But the guys who have the money and the responsibility and make the decisions very much don't trust or like operations research or optimization. They see such a project as a lose-lose situation: If the project fails, and some will, or just does not do very well, then their career takes a hit. If the project is nicely successful, then that success is a force in the organization that the C-level (CEO, CTO, CFO, etc.) can't really control. The C-level doesn't like to be out of control. Instead, they want to do things their way, and that does not include operations research.
So, I gave up on operations research and optimization.
So, instead of trying to have a career, doing optimization, working for a salary for people who don't want optimization, I'm doing a start-up. There is some pure/applied old/original math in the core of it, but no users will suspect anything mathematical.
SciPy.optimize: Similar pros/cons as Matlab other than open-source.
CVXPY (& its default backends): Modeling languages are great. First thing I will try for a new convex problem.
CVXGEN: Amazing, but infuriating that it can only be used through a web app.
PyTorch: Only supports unconstrained first-order methods. Automatic differentiation of arbitrarily complex functions is huge. Somebody should implement interior-point and SQP on the GPU for PyTorch.
As a researcher, my first impression is that your product is designed for people who want to deploy optimization in some service or business process, not for me.
I needed shadow prices defined using the master problem dual solution. In my problem instances I would very often run into scenarios where the primal (and hence also dual) master LP problem had a unique objective value, but the dual solutions at which that maximum was attained were non-unique. I learned that it only makes sense to talk about shadow prices within some allowable range for each dual decision variable, but LP solvers generally don't give you an API to extract much information about this from the current internal state of the simplex algorithm [0]. I read a bunch of LP solver documentation, and the best discussion of this kind of thing I found was the section in MOSEK's manual about sensitivity analysis [1]. This was for a hobby project, not business, so I didn't try out MOSEK, even though it looks very modestly priced compared to other commercial packages.
What I did find, however, was that some time during the last few years, scipy.optimize grew an LP solver interface, and even better, Matt Haberland contributed a python implementation of an interior-point LP solver straight into scipy [2][3]. I found that Haberland & co's open source interior point LP solver produced dual solutions that tended to be more stable and more useful for shadow prices in my problems than a bunch of the other open source LP solvers I tried (including the various other LP backends exposed by scipy.optimize.linprog).
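For reference, recent scipy versions expose dual values directly on the `linprog` result object when using `method="highs"` (the `marginals` are signed sensitivities of the objective to the right-hand sides, subject to the non-uniqueness caveats discussed above). A minimal sketch on a toy LP:

```python
from scipy.optimize import linprog

# minimize -x1 - 2*x2   s.t.   x1 + x2 <= 4,   x2 <= 3,   x >= 0
res = linprog(
    c=[-1, -2],
    A_ub=[[1, 1], [0, 1]],
    b_ub=[4, 3],
    bounds=[(0, None), (0, None)],
    method="highs",
)
# res.x is the primal optimum (x1=1, x2=3 here, objective -7);
# res.ineqlin.marginals holds one dual value per inequality row,
# i.e. the shadow-price information the comment was digging for.
```

For a minimization with binding `<=` constraints, relaxing a right-hand side can only help, so these marginals come out non-positive.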
[0] A paper I found very helpful in understanding what was going on was Jansen, de Jong, Roos, Terlaky (1997), "Sensitivity analysis in linear programming: just be careful!". In 1997 they got 5 commercial LP solvers to perform a sensitivity analysis on an illustrative toy problem, and although the solvers all agreed on the optimal objective value, none of them agreed with each other on the sensitivity analysis.
[1] https://docs.mosek.com/9.2/pythonapi/sensitivity-shared.html
[2] https://github.com/scipy/scipy/pull/7123
[3] https://docs.scipy.org/doc/scipy/reference/generated/scipy.o...
I use MATLAB/Octave, with the SeDuMi solver (to share open source code) or MOSEK (amazing, with free academic licenses).
For the modeling, I use either YALMIP (amazing breadth of functionality, though additional features tend to not compose well), and CVX (restricted use cases, but I find it amazingly robust).
I haven't made the jump to Python (CVXPY support for complex variables seems missing/shaky), or Julia (I tend to write OOP-lite code and haven't been able to wrap my head around Julia's abstractions).
I've also been exploring exploiting symmetries in convex programs, see https://replab.github.io/web
My typical setup is using Pyomo for model formulation (gives me flexibility to switch out solvers with ease). I bundle multiple licenses using GAMS, it is more cost effective than purchasing individual licenses from solver companies.
* MiniZinc is my favourite tool for prototyping models. Looking into the feasibility of using it in a production environment as well. Typically models are small to medium sized. Also using it for recreational problem solving (e.g., Advent of Code).
* Gecode is my go-to solver for writing applications where I need more control over the solving process or I want to write a custom propagator or heuristic. Used it for scheduling, planning, and configuration.
* When in the JVM ecosystem, I've used Choco.
* I've used OR-tools some, but would like to be better at it. Mainly because OR-tools has a nice set-up with a lazy clause generation solver, good automatic heuristics, and a nice portfolio solver for parallel work.
* Quite often, a custom optimization heuristic is also the right tool for the job.
I've tried to use Gurobi a few times, but for the problems I've tried, it has either been too hard to model effectively or it was not a good fit. The licensing cost is a limiting factor as well.
Xpress is pretty good at performance if you put a bit more effort into getting the model right and has very reasonable pricing.
CBC is incredibly slow compared to commercial solvers. The documentation is awful, and it seems like they can't even compile it sensibly anymore because they forked autotools at some point.
General solvers:
- LocalSolver: good docs, reasonable pricing, docs make it sound fast
- ORTools: Open source; we were leaning towards this since we're not doing anything incredibly fancy and it'd be nice to embed the solver in our server.
- Gurobi: major player in the space, will charge you an arm and a few legs.
Vehicle routing specific solvers were covered in absolutely outstanding detail by "Open Source Routing Engines And Algorithms – An Overview" [1].
We were leaning towards ORTools since we didn't want to get charged by number of jobs or compute. LocalSolver also seemed promising.
Also, is there a cost-efficient way to get distance matrices for ~50 stops? Google maps and Mapbox are wicked expensive. Is PostGIS reasonable?
[1]: https://gis-ops.com/open-source-routing-engines-and-algorith...
One thing I don't like about OrTools is that it seems there are better solvers available if you compile it from source, but I failed to get it built. As a result, I use the "CBC_MIXED_INTEGER_PROGRAMMING" solver because it's built into the precompiled libraries, not because it's necessarily the best one. It's unclear to me what benefit other solvers offer; the solver doesn't seem to be where my problems typically are.
For NLP I usually go with either https://worhp.de/ or just Ipopt.
Hadn't heard of JoD - that looks really interesting. What are your experiences so far?
My experience so far: Quite good at finding feasible solutions when constraints are complex. Really young code with a lot of potential. Not that good at finding optima yet.
Although jsprit isn’t the most lightning fast VRP solver, it’s highly customizable, and fast enough. Customizability is huge for us, as we have a tonne of very particular hard and soft constraints we need to be able to represent, and jsprit has always been up to the task.
lpsolve is simple to use. Open source, with packages in most every language. Easy to embed.
CPLEX because it's fast. With CPLEX I often find it takes more time (1 - 2 orders of magnitude) to formulate/construct the problem than it does for CPLEX to actually solve it.
I would love to get the chance to use Gurobi.
After I picked up the project again a year later (around 2019ish), I stumbled across OSQP again. OSQP blew both cvxopt and MOSEK out of the water in terms of speed (up to 10 times faster) and quality of the solutions (not as sensitive to bad conditioning). Plus the C interface was quite easy to use and super easy (as far as numerics C code goes) to integrate into my larger project. I particularly liked that the C code has no external dependencies (more precisely: all external dependencies are vendored).
Often the mathematical model of the real-world problem, or the input data used to parametrise it, carries a fair bit of approximation error (e.g. assuming parameters are deterministic when they are actually uncertain, linearising things to bash them into the MIP modelling framework, etc.). So pragmatically it often isn't useful to worry much about getting an optimal solution to an approximate problem, versus getting an approximate solution to perhaps a better model of the true situation.
http://plato.asu.edu/bench.html
This one stays pretty well updated. On the MIP side:
* COPT
* SCIP
* HiGHS
seem to be top performers (of the OSS variety). IIRC OR-Tools often by default uses SCIP for its MILP backend.
Community sentiment seems to be beginning to shift toward favouring the HiGHS solver (https://github.com/ERGO-Code/HiGHS) over CBC. Something I'm keeping a close eye on.
nextmv seems to pitch itself as a generic solving ("decision automation") platform or something (unclear). But it seems that the only fleshed out product offering is for vehicle routing, based on the docs. Are there plans to offer, for instance, a solver binary that can be used to solve generic problems?
Also all the github repos under https://github.com/nextmv-io are private, so links from docs are 404.
I don't mind the overhead in developer time. Once you 'git gud' at translating problems to MIP/ LP, it's just a matter of learning syntax. The real trick is in translating problems.
It's a shame more CS courses don't focus on this problem translation meta-algorithm. You can get it in tough theory courses with problem mapping to prove complexity, but its usefulness goes way, way beyond that somewhat niche application. Learning to model problems as max flow, graph partitioning / matching, LP, MIP, gives you absolute super powers way, way beyond what you'd gain by having a solver that's easier to work with.
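As a concrete instance of the "model it as max flow" superpower mentioned above: bipartite matching reduces to max flow by adding a source and sink with unit capacities on every edge. A compact Edmonds-Karp sketch:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: push flow along shortest augmenting paths."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return total
        # Find the bottleneck, then augment along the path.
        v, bottleneck = t, float("inf")
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
        total += bottleneck

# Matching workers {1, 2} to tasks {3, 4}: source 0, sink 5, all
# capacities 1. The max flow value equals the max matching size.
cap = [[0] * 6 for _ in range(6)]
cap[0][1] = cap[0][2] = 1               # source -> workers
cap[1][3] = cap[2][3] = cap[2][4] = 1   # who can do which task
cap[3][5] = cap[4][5] = 1               # tasks -> sink
```

Once the reduction is second nature, a whole class of assignment problems stops needing any bespoke algorithm at all.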
In the context of probabilistic modeling, my experiences with belief propagation have been good, but somewhat mixed. Sometimes it feels like mean-field methods could give more bang for the buck given how much less memory they use.
In general, I would recommend OR-Tools.
After trying different stuff, and under different platforms (e.g. Matlab + CPLEX, Python + GLPK, etc), I settled first on Python or-tools with COIN-LP (about 100x faster than GLPK). Later we got access to FICO Xpress solver which in turn gave us some other gains.
I really liked this benchmarking website: http://plato.asu.edu/bench.html
It seems like they got into some trouble, so I don't know if they are still tracking the main commercial solvers.
I use JuMP so I can switch solvers to Mosek or an open source solver if something doesn't go as expected.
Python: PyOptSparse with IPOPT (and NSGA-II, although not the best implementation) and scipy.optimize.
I use them for the Design Optimization of aircraft with multiple non-linear constraints and sometimes multiple objectives.
I would have liked some kind of API that I could call out to instead but nothing existed at the time: you pass in the inputs to construct the linear equations and then you get back the results.
For fun I made this Golomb ruler solver using cplex: https://github.com/strateos/golomb-solver
It’s great, but you have to be able to put your problem into Excel to use it which does not work for all use cases.
In cases where I’m trying to optimise something that doesn’t quite fit into Excel cleanly I’ll usually do some sort of hand-rolled Monte Carlo optimisation. This generally works for me in the types of problems I most often solve.
In the usual Excel Solver case, for a simplified example, I want to create a biased coin where getting heads doubles your money and tails loses your money, with expected payout 90% in the long run. The parameters to change are heads/tails coin weighting, to target a value 90%. A fair coin gives a long run payout of 100%. It turns out that if a biased coin hits heads 45% of the time and tails 55% of the time, we end up with this 90% payout. Excel solver can come up with these two values.
In this example, we can come up with these 45% and 55% values theoretically. With more complex systems, we may want to see the effects of changing a subset of parameters that have an indirect effect on payout.
For example, if we extend the “biased coin game” to instead be “win your money back, 2, or 3 times your wager each with some probability” on a heads result, there are more parameters (and many solutions!) to get to 90% payout. Changing the heads probability has a further, second-order effect on the payout. Using Excel solver it’s straightforward to fix some parameters (heads/tails probability) and allow others to change (1x/2x/3x odds) to get desired results (90% payout).
If we further extend this methodology for a biased coin game, we can end up with a coin game that pays with distribution exactly like a slot machine in a casino, though it would not look like one!
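The same calibration can be done without Excel Solver: fix the payout odds and bisect on the heads probability until the expected return hits the target. A small stdlib-only sketch of the extended game described above:

```python
def expected_return(p_heads, payouts, payout_probs):
    """Long-run return: on heads, pay one of `payouts` (multiples of
    the wager) chosen with `payout_probs`; on tails the wager is lost."""
    mean_payout = sum(m * q for m, q in zip(payouts, payout_probs))
    return p_heads * mean_payout

def calibrate(target, payouts, payout_probs):
    """Bisect on the heads probability to hit the target return."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if expected_return(mid, payouts, payout_probs) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The simple double-or-nothing coin from the comment: the payout is
# always 2x, so a 90% long-run return needs p_heads = 0.45.
p = calibrate(0.90, payouts=[2], payout_probs=[1.0])
```

With the multi-payout extension (e.g. 1x/2x/3x with fixed odds), the same bisection works because the expected return is still monotone in the heads probability.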
I started to write up a page to track the "business of solvers".
Should be "launching" in the next week or so.
I like how it targets both MIP and CP solvers. It's actually the tool that first led me to CP and Gecode.