2017 Stats Thread

There are three lost points so far that really annoy me: 1 against RSL (even with a weakened side we still should have gotten a tie, especially after scoring 4 minutes into the game) and the 2 against the Revs (ugh). Having those 3 points would put NYCFC second in the table and in line to have 30 points at the midway point (with 27 points now, they need 1 win in the next 2 games) - putting them on target for 60 points and a sure top 2 seed.

In any case they will have to improve in the second half for that top 2 seed and stop blowing results.

It will be different without a brutal condensed away schedule and our players back from injury and international duty. I hate making excuses, but it's obvious what happened.
 
Here are the odds for our next 3-game stretch, based on predictions published on the 538 website.

https://projects.fivethirtyeight.com/soccer-predictions/mls/

I am a little surprised that our expected points (4.66) are so low for the next 3 games. It is actually slightly less than it was for our last 3 games (which were 4.80 vs. 4.0 actual). The 538 bot seems very pessimistic about our chances at New Jersey. I think people on this forum would be disappointed with anything fewer than 5 or 6 points, and with good reason.

[EDITED to update with correct chart]

Game Predictions.jpg
 
Can you help me understand the expected results section? Why should the outcome of the first two games impact the odds in the third game? It's three independent calculations, right?
 

It is 3 independent calculations, but to make it fit in a typical Excel sheet, I need to run the first 2 of them down one side. So you see all 27 outcomes. Take, for example, the chance we beat Seattle, tie NJ and then beat Minnesota. That one is represented by WD on the left and W across the top. The odds that we win the first and draw the second are 12.8% (51% times 25%). Then you multiply that by the 54% chance we win the last game. Final odds of win-draw-win should equal 6.9%.
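
As a quick check of the multiplication described above, here is a minimal Python sketch. It assumes the three games are independent and uses only the per-game odds quoted in the post (51% win vs. Seattle, 25% draw at NJ, 54% win vs. Minnesota). The function name is just illustrative.

```python
from math import prod

def sequence_probability(per_game_probs):
    """Probability of one specific run of results, assuming independent games."""
    return prod(per_game_probs)

# Odds quoted in the post: win vs. Seattle, draw at NJ, win vs. Minnesota.
win_draw = sequence_probability([0.51, 0.25])
win_draw_win = sequence_probability([0.51, 0.25, 0.54])

print(f"W-D:   {win_draw:.4f}")
print(f"W-D-W: {win_draw_win:.4f}")
```

With these inputs the three-game product is 0.51 × 0.25 × 0.54 ≈ 0.069.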
 
I think there's a mistake. Just eyeballing it I noticed that you have the W-L-W chance at 3.6%. But W-L-W has to be the result with the highest chance, because we have a 51% chance of a win, followed by 48% chance of loss, followed by 54% chance of a win. No other outcome can be as high. Yet others are much higher, including L-W-W at 13.2%, which should be much less likely. Yet everything adds up to 100%, or close enough given rounding quirks, so maybe some got switched?
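
The argument here can be checked with a few lines of Python. This is a sketch that assumes independent games and uses only the three odds quoted in this post; when games are independent, the most likely three-game sequence is just the most likely result of each game multiplied together.

```python
# Most likely single result in each game, as quoted in the post:
# win vs. Seattle, loss at NJ, win vs. Minnesota.
most_likely_per_game = [("W", 0.51), ("L", 0.48), ("W", 0.54)]

label = "-".join(result for result, _ in most_likely_per_game)

probability = 1.0
for _, p in most_likely_per_game:
    probability *= p  # independent games, so probabilities multiply

print(f"{label}: {probability:.1%}")
```

That comes to about 13.2% for W-L-W, the same figure the chart listed under L-W-W, which fits the guess that some cells got switched.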
 
But I'm not certain the math is correct for the last calculation. For example, the chances of winning our first two are at 13.8%, so you would expect all of the outcomes across that row to add up to 13.8%, but they add up to 15.2%.
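
The row check described here can be sketched in Python. It assumes independent games. The 13.8% and the 54% win chance come from the thread, but the draw/loss split for the third game is a hypothetical placeholder, since those two numbers are not quoted.

```python
def row_cells(p_first_two, third_game_probs):
    """Cells of one table row: P(first two results) times each third-game outcome."""
    return [p_first_two * p for p in third_game_probs]

p_win_win = 0.138  # chance of winning the first two games, as quoted in the post

# Third game: the 54% win chance is quoted in the thread. The draw and loss
# values are HYPOTHETICAL, chosen only so the three odds sum to 100%.
third_game = [0.54, 0.25, 0.21]

cells = row_cells(p_win_win, third_game)
row_total = sum(cells)
print(f"row total: {row_total:.1%}")
```

Any split that sums to 100% gives the same row total of 13.8%, so a row adding up to 15.2% points to a formula problem rather than rounding.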
 

Shit. You guys are totally right. I copied the tables for the last set of 3 games and pasted them to a new area, and it screwed up the calculations. Here is the correct table.

You will see the total probabilities sum to only 99%, which is due to rounding by 538 of the odds against Minnesota.

Game Predictions.png
 
I think there is a lot of truth to this - if you run the chances on 3 straight games, the mean outcome is going to be closer to 4-4.5 points than would seem to be the case. Still, I think 538's odds are too low for us. They still rate our defense poorly for some reason. I would think our real chances in each game are a little better than what they show.

They rate our defense poorly because the data are trailing for one or two seasons. So we have the dumpster fire of year two and maybe year one to contend with for our defensive weighting. That's going to give us a bad defensive rating, but I can see why they do it; more data points in the spreadsheet.
 
The other rather huge factor is that our consensus is that our best 11 is top 1-3 in the league, and many would limit that to the top 1. So let's assume we're right. 538 has no data of us playing our best 11. Herrera has played 2 full games out of 15. Since we're close to halfway through the season, we could get to the last month before the data catches up to current status.
 
Interesting analysis; it shows how "unlucky" NYCFC has been. NYCFC and CHI are the only 2 teams to have positive xG at home and on the road.

Different xG models may produce slightly different results, but my tally is that we would be 12-3 based on xG. That would be tied for the best in the league with SKC. East as a whole would stack up like this:

NYCFC - 12-3
Orlando - 10-5
Chicago - 9-5
Toronto - 9-6
New England - 9-6
NYRB - 7-8
Atlanta - 6-7
Columbus - 7-9
Philadelphia - 6-8
Montreal - 5-7
DC United - 3-11


The other thing I did was try to normalize for home/away imbalance by projecting points for every team based on remaining home and away matches, using their home and away PPG to date. It fails to account for strength of schedule and some other relevant data, but it eliminates the home/away imbalance, which is so relevant in MLS given the extreme skews. That produces the following projected point totals:

Toronto - 66.8
Chicago - 60.7
NYCFC - 55.3
Atlanta - 49.7
Orlando - 49.1
New England - 47.7
Montreal - 45.3
NYRB - 43.4
Columbus - 40.4
Philadelphia - 38.9
DC United - 36.8
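
For anyone who wants to reproduce this kind of projection, here is a rough Python sketch of the method as described: current points plus home PPG times remaining home games plus away PPG times remaining away games. The function name and the sample team are hypothetical, not actual records, and the 17 home / 17 away default assumes the 34-game MLS schedule.

```python
def project_points(home_pts, home_played, away_pts, away_played,
                   home_total=17, away_total=17):
    """Project a season total from home and away points-per-game to date."""
    home_ppg = home_pts / home_played
    away_ppg = away_pts / away_played
    remaining_home = home_total - home_played
    remaining_away = away_total - away_played
    return (home_pts + away_pts
            + home_ppg * remaining_home
            + away_ppg * remaining_away)

# HYPOTHETICAL team: 10 points from 5 home games, 5 points from 5 away games.
projected = project_points(home_pts=10, home_played=5, away_pts=5, away_played=5)
print(f"projected points: {projected:.1f}")
```

As in the post, this ignores strength of schedule; it only removes the distortion from having played more home or away games so far.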
 

I don't suppose you projected future point totals based on xG, with or without a H/A correction?
 
Looking at the xG table here:
http://www.americansocceranalysis.com/team-xg-2017/

It makes us feel good because NYC has the highest xG differential in the league at +9.4, and NYC's actual GD of 8 is less than the xG differential by 1.4, which suggests that we've been unlucky, a bit.

Then I look at some of the really big discrepancies between actual GD and xGD.
Seattle unlucky by -10
Dallas lucky by 7.5
TFC lucky by 8.5
Atlanta lucky by 13.6

This makes me wonder. Atlanta especially. Atlanta's "luck" is almost all on the goals scored side of things. They scored 27, with xG of 15. But xG is based solely on shot location, and ignores the defensive positioning. Atlanta scores a lot of fast-break, counter goals where the defense is scrambling and out of position. Those are very high probability shots in a way that xG ignores.
NYC does not score or generate shots nearly as much that way. NYC takes a lot of shots inside the box while the other team has defenders in place getting in the way. This happened repeatedly in both games last week.
xG doesn't capture this difference, which seems like a major flaw.
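
The "lucky/unlucky" numbers in this post are just actual minus expected. A minimal Python sketch, using only the figures quoted above (NYC's GD of 8 vs. xGD of 9.4, and Atlanta's 27 goals vs. xG of 15); the function name is made up for illustration.

```python
def overperformance(actual, expected):
    """Actual minus expected: positive reads as "lucky", negative as "unlucky"."""
    return actual - expected

# NYC: actual goal difference vs. expected goal difference.
nyc = overperformance(actual=8, expected=9.4)

# Atlanta, attacking side only: goals scored vs. xG.
atl_attack = overperformance(actual=27, expected=15)

print(f"NYC GD - xGD: {nyc:+.1f}")
print(f"Atlanta goals - xG: {atl_attack:+d}")
```

So about 12 of Atlanta's 13.6 comes from the attacking side, which is the part the shot-location point above is questioning.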
 