Hey Axem,
First, I don't think we've ever actually crossed paths in any discussion threads, so to be crystal clear: my name is Matthew Enthoven, and I'm one of the founders of OpenCritic. I've monitored and commented on discussions about Wikipedia's use of review aggregators in the past, and saw the thread you started back when we first launched.
We're still trying to figure out how to make strides on Wikipedia. Previous conversations mostly concluded "too soon" and that we weren't "enough of a source in the industry." We wanted to keep challenging that and get more feedback. Since the start of this year, we've added numerous features and seen our standing as an authority rise, so we thought it'd be a good time to ask again: what is it that you guys look for?
We've added critic pages, with over 350 critics that have signed up and customized their page. To this day, we are the only aggregator that correctly attributes reviews to their author in addition to their publication.
We also added support for embeddable scores, which are now being used by The Escapist (see bottom of article) and Lazygamer. Websites such as Cubed3 and DarkZero now link to us in their footers, and PlayStation Universe lists us on their reviews.
We've been used as a source by Gamasutra (second paragraph), GeForce/Nvidia (see last paragraph), Examiner, Forbes, and others. We've also been added to many pages on the Portuguese Wikipedia. In the community, we're an officially sanctioned aggregator on the PS4 subreddit, and have been used across several reddit threads, oftentimes as the only aggregator listed now. Metacritic has even made significant score mistakes that a few of our users noticed.
We passed 100 publications included, and added word clouds that highlight key features and themes of reviews. We continue to see more and more traction across the board. We're adding 3DS and Vita titles now, with Fire Emblem Fates' review embargo already posted. We're the only aggregator that includes publications such as Eurogamer, AngryCentaurGaming, GameXplain, and TotalBiscuit, and we're the only aggregator that maintains the original score format. We also report on the percentage of critics that recommend the title, a statistic that allows us to include non-numeric publications. Finally, we continue to be the only website that's reliably and systematically publishing review embargo times.
We strongly believe that we are the fastest and most reliable aggregator. We are consistently faster than Metacritic, as several critics have noticed. We've invested heavily in our technology and our presentation, and believe strongly that, while we draw on the same data as Metacritic, we offer a more complete and informed picture of a title.
The reason I'm writing is: we really want to know what you guys are looking for. This isn't a "please put us on Wikipedia" type thing: we're young gamers and don't really consider Wikipedia readers to be our demographic, and since we have no advertising, they'd be revenue-negative anyway. Instead, we're just looking for feedback. We consider you, as a video game editor, to be an intellectual in an industry that we want to support and see thrive. So we want to know: what do you look for when evaluating OpenCritic as an "industry source"? What are the variables and factors? What are the things we can improve?
We're always on the lookout for ideas, and as we wrap up our next few features, we want to get your thoughts and opinions.
Sincerely,
MattEnth (talk) 05:31, 18 February 2016 (UTC)
- @MattEnth: Hi Matt, thanks for reaching out to me. I think it's great that you guys have been doing so well in the press and in your outreach stuff. I'll admit that I don't frequent OpenCritic that much, but that's mostly a function of the fact that I don't frequent aggregator websites that much. However, all of the coverage and evidence you've posted is very heartening and I'll definitely refer to it in future discussions. As for said future discussions, it's no secret that I am extremely unhappy with the current state of 1) the games industry's over-reliance on Metacritic as a metric for success, and 2) Wikipedia's (WP:VG's) contribution to that over-reliance by using only Metacritic scores to shape the tone of the Reception sections in game articles. Recently, there was a consensus reached at WP:VG to deprecate the use of GameRankings in review tables (a discussion which I missed, to my regret) and I think it sets a dangerous precedent where Metacritic can monopolize the tone of the discussion. I believe that the mere presence of a 2nd aggregator, even if imperfect and/or drawn from the same data, has an ameliorating effect on Metacritic's dominating mindshare.
- ALL THAT BEING SAID, it's only been 4 months since you guys launched and your traffic stats are orders of magnitude lower than Metacritic. There's not much you guys can do at this point, other than getting older and more established, which I think will help your standing. You should definitely keep spreading your wings with those embeddable scores and see if you can get more coverage comparing accuracy between yourself and Metacritic. For my personal edification, I'd love if there were some way for you guys to start working your way backwards in time to cover older games (so that data could potentially exist to cite in older articles here). I don't know when the big battle for OpenCritic's inclusion in the review table will be (my guess is it won't be within the next few months), but I'm definitely on your side when it happens. Axem Titanium (talk) 20:53, 18 February 2016 (UTC)
- Thanks for the reply! It's funny, but we took the word "Kingmaker" from Wikipedia back when we watched the conversations. And yes, we watched the GameRankings discussions too. We were a bit surprised with the outcome, as having ONLY Metacritic seems really odd.
- On the note of older and historic reviews, there are just so many challenges. For one, a lot of older reviews just don't even exist anymore. If you go to Ocarina of Time's original Metacritic page, only 2 of the 22 reviews still point to valid URLs. For legal reasons, we also can't just "scrape Metacritic," as it would be a strong copyright violation (similar to how it's illegal to copy phone books, or copy maps).
- Lastly, on the note of traffic: we are still growing, but I hope that Metacritic's traffic isn't the bar. Recall that Metacritic places a heavy emphasis on both movies and TV shows, and is perhaps more widely known for what they do with movies. Catching them in traffic as a games-only aggregator is simply impossible.
- Anyways, thanks for your feedback! MattEnth (talk) 23:18, 18 February 2016 (UTC)
- Yes, I was very disheartened by the GameRankings discussion. I recently fought to fix the reviews table, which had been mistakenly changed to forcibly hide GameRankings links by default, even if included (while simultaneously requiring Metacritic links) to a version which uses both as optional parameters. I understand the difficulty with trying to go back as far as Ocarina of Time, but any attempt to start working backwards in time from October 1, 2015 would be appreciated. And yes, I certainly wouldn't expect or require OpenCritic to reach Metacritic numbers to be considered for inclusion; I merely pointed it out as an aside to illustrate the current gulf in reach/maturity between the two websites. Traffic volume is unlikely to be a factor in any future discussions. Axem Titanium (talk) 03:48, 19 February 2016 (UTC)
- Hey Axem,
- Thought I'd start up a new chain. Thanks for your comments on my talk page - it's been very helpful.
- We've recently learned that two major industry players - United Talent Agency and a major PR agency (keeping them anonymous until I have explicit permission) - have begun including OpenCritic in their reports at their own clients' requests. Specifically, they're leveraging our % ranking and % recommended fields, along with our ability to rapidly export review data. They also appreciate focused context on reviews that they know their clients can go and read themselves.
- We're still chugging along, trying to be included in that reviews widget. Your old feedback - going back in time - is something we now hope to accomplish by the end of the year.
- Just wanted to open up this channel again and ask if you had any pointed feedback. We've made some significant updates and have had some significant growth. We've been used as a source by Ars Technica, Forbes, Examiner, Destructoid, the Portuguese Wikipedia, and Insomniac Games. Remedy Games now even lists us alongside Metacritic. We've doubled our Twitter followers in the last 90 days and are seeing Twitter engagement similar to Metacritic's despite an order-of-magnitude difference in following.
- To be clear, we aren't asking for inclusion - we just want to know what you guys look for in terms of "being an authority." The OpenCritic team is very driven by measurable goals and KPIs, and we conduct A/B tests all the time. Unfortunately, the goal of "be an authority" is hard to measure, which is why we turn to y'all for your thoughts.
- MattEnth (talk) 07:37, 15 July 2016 (UTC)
- Hi @MattEnth:, sorry I've been a bit busy this past week. We've already been over the features I'd still like to see. At this point, I'm planning to propose a major overhaul of the way aggregated reviews are summarized and presented in the VG Reviews box with the goals of 1) exposing and highlighting the raw number of reviews aggregated (a 95 average score based on 5 reviews is very different than a 95 average score based on 50 reviews), 2) including the spread of the review scores, probably with a standard deviation metric if available, and 3) the touted % recommended metric that you guys have been working on. I'm also hoping to include tooltips or some kind of collapsible explanation box of the methodologies of all the aggregators we use in order to prevent the spread of misinformation about how statistics work.
- When we last spoke, I believe you guys were going to different publications to ask them the parameters they'd like to set for a review of theirs to be considered a "recommendation". That was what was holding up the process of featuring the % recommended metric more prominently on your game summary pages, iirc. Do you have an update on this process? I would be happy to launch the proposal to coincide with your launch of the feature. Axem Titanium (talk) 20:11, 22 July 2016 (UTC)
- Hi Axem Titanium. I completely missed your previous message, and for that I am sorry. We have now given publications the ability to edit when a game is "recommended" and when it isn't, but most seem to be sticking with our 8.0 cutoff. Contributors are obligated to report it each time.
- It's still unlikely that we'll switch metrics in the near future. We haven't found a good path to actually doing the conversion. The issue is that OpenCritic must continue to be used in conversations if we're to continue growing, and most of those conversations still center on average review scores. We're ready to make the switch at any time, but it's going to require a concerted and focused effort from across the industry, not just one player.
- Also, hooray, OpenCritic has now been live for a year ^_^
- I also just wanted to mention our recent growth, specifically passing Alexa 100k and continuing to be included in review threads. In some cases, we are now the only aggregator displayed (like Battlefield 1 on NeoGAF and the Battlefield subreddit). NeoGAF was particularly interesting, as there isn't a single Metacritic link in the 8 pages of discussion. GeForce used us in their Shadow Warrior 2 coverage. This has started to become a weekly occurrence.
- In terms of Wikipedia, we still (obviously) feel that the current standing, with Metacritic as the only aggregator, gives them undue weight. We've spent the last year establishing ourselves as an authority in the industry and continue to want your feedback on how we can demonstrate that.
- I also do hope you believe me when I say we want your feedback. I worry sometimes that I come across as begging, which isn't my intent at all. We've used a lot of the feedback we've gotten from Wikipedia mods to help us in other conversations. We honestly don't know what we should be looking for when it comes to "being an authority," and Wikipedia has been a fun and strange place where we can candidly ask "what are you looking for" and get real feedback. It's much harder to pose that question to partners. MattEnth (talk) 22:27, 18 October 2016 (UTC)
- Hi @MattEnth:, thanks for the update. Good to see that you guys have been doing well, especially among player communities, and for cracking 100k on Alexa (78k = notbad.jpg). I understand the chicken-and-egg problem you have with converting to % recommended scores. Perhaps featuring them side-by-side instead of almost below the fold, where it's currently placed, would help drive adoption? Just spitballing here.
- Here's a SUPER quick and dirty mock-up of what I mean, which takes advantage of some extra real estate to bring all the data vitals above the fold: http://imgur.com/a/CwSbK Axem Titanium (talk) 03:20, 19 October 2016 (UTC)
- We actually have a pretty similar mockup, and we've thought about doing it for a while. The information, however, is not redundant. The top changes when a user customizes their trusted publications, while the "Official Score" and "Contributor Average" stay the same regardless of user preferences. MattEnth (talk) 15:16, 25 October 2016 (UTC)
- Ah ok, I didn't realize. At any rate, as they say in business, "location, location, location". I'm sure you guys can figure something out that shifts the focus toward your goals. Axem Titanium (talk) 20:41, 25 October 2016 (UTC)
- The other thing on my proverbial wishlist is that standard deviation measure. Obviously it's a pet project of mine but the reason I'd like to see SDs reported by SOMEONE ANYONE PLEASE is because there's a rule on Wikipedia that disallows performing basic statistics on existing data as a form of original research. However, if a source reported that data with an established methodology for obtaining it, then it could be cited on Wikipedia just fine.
- One concern with standard deviation is that the metric is skewed less by the review scores and more by the score formats. To put it more simply: how many "0-5 stars, whole stars only" publications are there? Those 20-point swings can result in some wonky standard deviations.
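- A quick toy sketch of what I mean, with made-up numbers (purely illustrative, not real review data):

```python
from statistics import pstdev

# Four critics with nearly identical opinions of a game (0-100 scale).
# Hypothetical numbers, purely for illustration.
opinions = [85, 88, 92, 95]

# The same opinions forced through a "0-5 stars, whole stars only"
# format: every score snaps to the nearest multiple of 20.
whole_stars = [round(s / 20) * 20 for s in opinions]  # [80, 80, 100, 100]

print(pstdev(opinions))     # ~3.8  -- the critics basically agree
print(pstdev(whole_stars))  # 10.0  -- same opinions, "polarized" SD
```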
- Another concern is that the standard deviation isn't actionable for most people. Most gamers won't know what a "high" or "low" standard deviation is. So putting this on the page is more than just listing a number; we have to find a way to contextualize it. Does a standard deviation of 12 mean polarizing? Or consensus?
- Perhaps a better metric would be the number of outliers? Or some sort of "polarized score" that we create as a hybrid of standard deviation and the number of reviews outside that deviation relative to the number inside? etc. MattEnth (talk) 15:16, 25 October 2016 (UTC)
- @MattEnth: That's a very good point that I didn't think of. If SD is more a reflection of "proportion of 5-star scales", then it stops being useful. From my perspective, I would like a metric that measures "polarization", as you say, i.e. did most reviewers agree on this game's quality or did they disagree? That's a useful thing to know and informs the way I'd write about the game's reception (and I thought SD would reveal that but it seems it doesn't). The number 80 doesn't tell me that because it could be 80s across the board or equal numbers of 60s and 100s, which are very different things obviously. I'd love to brainstorm more in depth about some metric that captures this fairly. Axem Titanium (talk) 20:41, 25 October 2016 (UTC)
- One of our thoughts is a "Percent Consensus." For each publication, we would ask "Is this publication within the standard deviation OR a single granularity point (whichever's larger)?"
- So to give an example, say we have four publications. Publication A gives 4.5/5 stars, Publication B gives an 8.8/10, Publication C gives 3/5 stars (whole stars only), and Publication D gives a 7.8/10. The average is 79 and the standard deviation is 13.7. We'd say that there's 100% consensus. Publication A translates to a 90, which is within the standard deviation. Publication B's 8.8 translates to an 88, also within the standard deviation, and Publication D's 7.8 translates to a 78, within as well. Publication C's 3 stars translate to a 60, which is outside the standard deviation. However, because their score format moves in 20-point jumps, we would instead look at the granularity: they are within 20 points (one star) of the average, and so we'd say that they agree with the consensus.
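- Here's a rough sketch of how that rule might look in code (the function and the score normalization are illustrative, not our production code):

```python
from statistics import mean, stdev

def percent_consensus(reviews):
    """reviews: list of (score_on_100, granularity) pairs, where
    granularity is one step of the outlet's format on a 100-point
    scale (20 for whole stars out of 5, 10 for half stars, 1 for
    an x.x/10 scale)."""
    scores = [s for s, _ in reviews]
    avg, sd = mean(scores), stdev(scores)
    agree = sum(1 for s, g in reviews if abs(s - avg) <= max(sd, g))
    return 100 * agree / len(reviews)

# The four publications above: A = 4.5/5 stars, B = 8.8/10,
# C = 3/5 whole stars, D = 7.8/10.
print(percent_consensus([(90, 10), (88, 1), (60, 20), (78, 1)]))  # 100.0
```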
- Just an idea. Not very fleshed out. Would love your thoughts. (Also I realize this TALK section is getting huge - perhaps we should move to a new section?) MattEnth (talk) 22:52, 25 October 2016 (UTC)
- As I've mentioned before, I'm deeply unsatisfied with the way Metacritic currently dominates Wikipedia and the industry. If the end result of our conversations is a more feature-complete rival to it, all the better. Axem Titanium (talk) 03:17, 19 October 2016 (UTC)
- It's just frustrating for us because we're still viewed as this weird little underdog when all of our data indicates that we're now successfully established. I mean, Respawn's community manager had a choice of which review aggregator to tweet out. He chose us. And this happens every week now.
- I know the common response is "it just takes time," and there's a lot of truth to that. My personal frustrations stem from the feeling that it's just a big Catch-22: We can't get X because we don't have authority, and we can't prove authority without X. X can be anything from licensing agreements, to advertising contracts, to getting listed on Wikipedia :-P MattEnth (talk) 00:27, 26 October 2016 (UTC)
Hey Axem,
Just wanted to follow up again. OpenCritic has continued to grow and gain prominence in the industry. We'd still love to get your feedback on how we can make progress as an authority.
Some of our progress...
And we're still in every review thread. Here are some of the recent ones...
We're in the process of redesigning our site to have a cleaner, more professional, and more readable look. Any thoughts on data we might want to include? MattEnth (talk) 21:13, 6 March 2017 (UTC)
- Hi @MattEnth:, thanks for reminding me about this. I've been knee deep in my own research and this project fell by the wayside. Mostly, I was hoping to turn it into a grand rethinking of how aggregate scores should be reported on Wikipedia but I realized that it's no good to try to bite off more than I can chew. I made a simple proposal to add OpenCritic to the reviews template to get the ball rolling and I'll think about perfectly packaging an argument for new methods of reporting later. Baby steps.
- With respect to data, I'm glad you asked! I've been talking to some of my data science professors about this type of problem and I realized we're both approaching it in the wrong way. This is why, in hindsight, I was hesitant to endorse your "Is this publication within the standard deviation OR a single granularity point (whichever's larger)?" proposal above. A better way to think about it is to treat individual review scores as single studies in a meta-analysis. Each reviewer is conducting a "study" on a game to come up with a result (the final score), but they're all using different measures and different scales. Unfortunately, individual reviews do not have variances since they're only a single number. However, since you guys have such a rich data-set for reviewers, you can use each reviewer's variance among his/her own reviews as the inverse weighting for a random effects model. This essentially "controls" for the fact that some reviewers only use 6.0-9.0 on a 10 point scale whereas other reviewers use the full scope of a 1-5 star scale. What do you think?
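- A rough sketch of the weighting idea, with toy numbers (and obviously not your real schema):

```python
from statistics import pvariance

# Toy data: each reviewer's historical scores (normalized to 0-100)
# plus their score for the game at hand. Made-up numbers.
history = {
    "ReviewerA": [40, 55, 70, 85, 95, 60, 30, 78],  # uses the whole scale
    "ReviewerB": [76, 80, 82, 79, 81, 84, 78, 77],  # lives in the "6.0-9.0" band
}
scores = {"ReviewerA": 90, "ReviewerB": 82}

# Standard inverse-variance weighting from meta-analysis: each "study"
# (reviewer) gets weight 1 / variance. A full random effects model
# would add a between-reviewer variance term (tau^2) to each
# denominator; this is the simplified fixed-effect version.
weights = {r: 1 / pvariance(h) for r, h in history.items()}
pooled = sum(weights[r] * scores[r] for r in scores) / sum(weights.values())
print(round(pooled, 1))  # ~82.1 -- dominated by the low-variance reviewer
```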
- And while we're at it, I think you should put your money where your mouth is and feature your % recommended metric more prominently on each game's page. :) Axem Titanium (talk) 20:09, 8 March 2017 (UTC)
- Thanks. We are working towards a large game details page revamp that does feature the three stats prominently: average critic score, percentile ranking, and percent recommended. I'd be curious about your thoughts regarding "percentile ranking" - we thought that might be an interesting headline metric. It gets a lot of attention in online discussions, at least - makes an 82 average a bit more tactile for a lot of gamers that don't follow review scores.
- For me personally, it's been pretty fulfilling to see us shift some conversations. We talked about this as a team last night, and we think that little bar chart is helping the industry quite a bit. People don't just screenshot the score orb - they screenshot the whole thing. And getting that context for review scores is really great. Before OpenCritic, I think a lot of gamers would have pretended that Ghost Recon: Wildlands' 78 average was "terrible," when really, it's in the top 25%. Horizon Zero Dawn's 88 gets a lot of "haha, not 90" trash comments, but it's way, way more impressive to say "top 2%."
- For that reason, we've thought about switching to percentile ranking as the primary metric. It's already the metric we use for the Mighty/Strong/Fair/Weak ranking, and it seems to be resetting some gamer expectations. It might be a more positive stat, if we can make sure that we continue to maintain transparency.
- For other data, we've been looking more and more at what we can do with it all. We store every review document and have been looking at some NLP models to start doing things like tagging games, analyzing review sentiment, etc. MattEnth (talk) 03:04, 9 March 2017 (UTC)
- I think percentile ranking with the graph is a great idea. I'm curious how often you update the bounds for each bucket in the histogram and the CDFs? Also, I notice that you treat each aggregated score as the same across games (i.e. all games with an aggregate score of 89 are reported as top 1.6%). Is there a reason to do it this way as opposed to taking the true, un-rounded/truncated aggregate score and finding its exact percentile rank? Axem Titanium (talk) 03:05, 9 March 2017 (UTC)
- The reason sucks, but it's server performance. We originally architected the system to let the graph respond to your personalized outlet selection. We had to disable it at the start of February - we've had thousands of people flood OpenCritic at embargo time for major titles. Resident Evil 7 crashed us and Horizon Zero Dawn + Zelda were looking massive.
- We're revisiting the computation and will probably just standardize the graph to official outlets to save performance. This will possibly also let us compute the exact percentage rather than the bucketed one. MattEnth (talk) 06:02, 9 March 2017 (UTC)
- To save performance, you might consider snapshotting the current graph when a game is released and use that. That way, games won't shift based on things that get released in the future. Axem Titanium (talk) 18:22, 9 March 2017 (UTC)
- The problem with snapshotting is that we can't do it with a user's list of publications. There are 200 publications right now, meaning 2^200 different combinations. To make that graph customized to the user's publication preferences, we have to recompute the scores of each game, which is effectively pulling our entire review scores database.
- As soon as we say that the graph isn't based on user preferences, the problem becomes trivial. As you note, we can cache pretty aggressively.
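- For what it's worth, once the graph is standardized, the cached version is basically just a sorted list of game averages plus a binary search (a simplified sketch, not our actual code):

```python
import bisect

# Precomputed once (e.g. nightly, or whenever a score changes): every
# game's average score across official outlets, sorted. Toy numbers.
all_game_averages = sorted([52.3, 61.0, 68.4, 72.9, 77.8,
                            81.2, 84.5, 88.7, 91.3])

def percentile_rank(score):
    """Exact (un-rounded) percent of games scoring below `score`."""
    below = bisect.bisect_left(all_game_averages, score)
    return 100 * below / len(all_game_averages)

# Serving a page is then an O(log n) lookup instead of a recompute
# over the whole review-score database at embargo time.
print(round(percentile_rank(88.7), 1))  # 77.8
```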
- It's been interesting watching that discussion/voting(?) on OpenCritic. To be honest, it's very useful - we come to you guys to get your perspective because you actually talk to us, heh. We can't really ask publishers or potential partners "Hey, what can we do to be more authoritative?" We used to, but now we're a bit past that stage: they'll look back at us and ask "you already are?" We're included in B&H Impact, Black Shell Media, fortyseven communications, and United Talent Agency's reports. We now have relationships with all major publishers, and they're very quick to jump on any errors or adjustments. We're the link that gets circulated in development offices because we publish embargo times and are much faster than Metacritic. It's a lot of fun watching Google Analytics light up disproportionately from the developer's home city.
- The bar of "does the press use OpenCritic" is interesting, because they do, but it's rare. In general, these media networks don't link outside of their network. That's part of Metacritic's strength - they've got consistent linkbacks from all CBS properties (GameSpot, Giant Bomb, GameFAQs and GameRankings) and get to negotiate as a collective.
- Just using Wikipedia's official sources:
- http://www.gameinformer.com/b/features/archive/2017/02/21/science-fiction-weekly-horizon-zero-dawn-power-rangers-bioshock.aspx (Editor-in-chief)
- http://pcworld.com/article/3093028/software/this-week-in-games-free-pc-games-free-pc-games-and-more-free-pc-games.html
- http://www.eurogamer.cz/articles/2016-01-27-dobre-rano-s-eurogamerem-streda-27-ledna
- https://www.destructoid.com/this-fan-made-star-fox-cartoon-sure-is-stellar-357663.phtml
- http://www.gamerevolution.com/news/resident-evil-2-remake-gives-capcom-producer-daily-headaches-36461
- http://www.gameplanet.co.nz/news/g5893adcfc737f/Action-RPG-Nioh-is-long-brutal-and-excellent-say-critics/
- http://www.gamezone.com/news/early-reviews-for-nier-automata-are-coming-in-and-its-lowest-score-is-a-90-3451405
- https://www.pcgamesn.com/worst-games-2016
- http://www.escapistmagazine.com/articles/view/video-games/editorials/reviews/16974-Enter-the-Gungeon-Review-Twin-Stick-Indie-Rogue-Lite#&gid=gallery_6123&pid=1
- http://www.vg247.com/2016/11/03/skyrim-special-edition-reviews-round-up-all-the-scores-so-far/
- We won't get every linkback, but we're happy with what we've got so far. It's challenging, though, to think of how, exactly, we'll penetrate the media networks: Vox Media (Polygon, The Verge, etc.), Future (Edge, GamesRadar, PC Gamer), Ziff Davis (PCMag, IGN), and Gamer Network (Eurogamer, Rock Paper Shotgun, USgamer, VG247). In reality, there just aren't revenue-positive stories to write in the space. Rarely is it in a publication's interest to link to an external site. Many of them are also in decline (due to the rise of video), and most of them view reviews as a medium that they must actively protect.
- But I found your "ivory tower of games journalism" comment very funny, because there really aren't many independent outlets on the Wikipedia list. Outlets like the Reno Gazette-Journal and the Pittsburgh Post-Gazette are really interesting, because we're talking about professional journalists who have spent their entire careers reporting on tech and games, working for major newspapers, and yet they aren't considered trusted in the gaming community.
- Anyways, we've got some fun stuff planned later this month. Enhanced Steam's integration proved our API well enough to some other partners. Hopefully we'll be announcing soon. MattEnth (talk) 22:38, 9 March 2017 (UTC)
- Players have a chip on their shoulder because of the derision games have suffered in mainstream media. So we retreat into the loving arms of more specialized corporate interests who are happy to exploit our clicks and eyeballs. Thus we patronize potentially compromised outlets like IGN and GameSpot because they were the only ones producing content suited to our needs (and now corporate-sponsored YouTubers and Twitch streamers are the next generation). It would be really cool if one of those video game consulting firms came out and said that they use OC for such and such reason. I mean, it would be even better if they said it was because the arithmetic mean is a bad measure (How to Lie With Statistics has a whole chapter on it), but as long as we're wishing, might as well go big.
- Looking forward to your next feature rollout. P.S. Can you add dates to your blog posts? I can only guess based on context. Axem Titanium (talk) 23:43, 9 March 2017 (UTC)
Interesting - looks like Wikipedia changed talk pages! Yes, we can add dates to our blog posts.
I'll reach out to them and see if they can make a public comment, or at least "declassify" one of their reports.
And yeah, we're not huge fans of the "mean" system, either. It's just necessary to get included in gamer conversations.
The weird thing is that gamers seem to place the *opposite* emphasis from what they should. The difference between a 90 and an 89 is quite small (only a 0.5% ranking difference). But each point in the 70s makes a 3% ranking difference. We get a lot of flak for being "almost the same as Metacritic," but in reality, those small 2-point differences are substantial: 72 vs 74 is a 7.5% ranking difference. MattEnth (talk) 23:31, 14 March 2017 (UTC)
- Gosh, that's odd. I had a different formatter for the previous talk page? I'll have to learn Wikipedia better, heh.
- Just curious, but does plagiarism matter to Wikipedia? Not sure if you ever had a chance to look, but we did send Metacritic a cease and desist after finding our data there: https://www.pcgamesn.com/the-witcher-3-wild-hunt/review-aggregator-site-opencritic-accuses-metacritic-of-using-their-data-without-consent
- Anyways, dates are now on posts. MattEnth (talk) 00:58, 15 March 2017 (UTC)
- Cool, thanks for doing both of those. The data copying thing probably belongs on the OpenCritic Wikipedia page, but it should have no bearing on the current discussion. Axem Titanium (talk) 15:17, 16 March 2017 (UTC)
Hi Axem,
I've dusted off our old comparison script and thrown a few samples here: http://pastebin.com/Vgu0zuqT
We don't want to slam Metacritic's servers so we've taken just a random sample of some high-scoring titles where we've seen differences.
Compiling this data did highlight the advantage of having a single score to capture the critical reception of a title. For Honor, for example, is a 77 on OpenCritic but a 79, 80, and 76 on Metacritic. Intuitively, a user would assume the score is closer to the 79/80 than the 76, which we believe is an incorrect assumption. Furthermore, this is a bad data presentation for the user. Larger outlets get counted significantly more often: IGN, GameSpot, Polygon, etc. review every platform's version, while Destructoid, The Escapist, and others only review a single version. In effect, you're further emphasizing the scores given by the larger publications.
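To make the weighting effect concrete, a toy example (made-up scores):

```python
from statistics import mean

# Toy example: two outlets review a cross-platform game. BigOutlet
# reviews all three versions; SmallOutlet reviews only one.
big_outlet, small_outlet = 85, 65

# Counting the multi-platform outlet once per platform pulls the
# overall average toward its score...
per_platform = mean([big_outlet] * 3 + [small_outlet])  # 80.0

# ...versus one entry per outlet, regardless of platforms reviewed.
per_outlet = mean([big_outlet, small_outlet])           # 75.0
```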
Edit: I also just want to highlight some of this stuff... As an example, Metacritic started aggregating this publication earlier this year. This one was added in Q4 last year. I don't want to come across as mean towards a small guy, but they're Alexa-ranked around 1 million, with fewer than 5,000 total social media followers. How did Metacritic evaluate these publications for inclusion? It seems pretty arbitrary, knowing it's one guy behind the scenes making this decision. Metacritic has also started claiming "translation exclusivity" to create barriers between it and OpenCritic. I strongly encourage you to do some digging on Metacritic's editorial standards and ask why they're considered a trusted authority at all. In our opinion, much of it is because "they just always have been." Some of these recent decisions aren't the actions of an authority. MattEnth (talk) 01:47, 23 March 2017 (UTC)
Also thought I'd give you some updates on looking at "degree of consensus." This is the distribution of standard deviations, and this is the scatter plot of standard deviation vs. number of reviews. We are planning on adding something to OpenCritic about this, but are trying to figure out how to best frame it. There's also a problem with the score and distribution to begin with: high-scoring games (85+) are naturally going to have a lower standard deviation because of the 100-point ceiling (plot). I also just need to sit down and really think about how to actually set the benchmarks of "critics disagree more than usual" or "critics agree more than usual."
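One direction we've been sketching, to control for that ceiling effect: benchmark a game's standard deviation only against games with a similar average (toy data and arbitrary thresholds below):

```python
from statistics import median

# Toy data: (average_score, sd_of_review_scores) per game.
games = [(91, 4.2), (90, 5.1), (88, 6.0), (78, 9.5), (76, 11.2),
         (75, 8.8), (62, 13.0), (60, 14.5), (58, 12.1)]

def consensus_label(avg, sd, games, window=5):
    """Compare a game's SD against games with a similar average, so
    high scorers are benchmarked against other high scorers."""
    peers = [s for a, s in games if abs(a - avg) <= window]
    benchmark = median(peers)
    if sd > 1.25 * benchmark:  # thresholds are arbitrary placeholders
        return "Critics disagree more than usual"
    if sd < 0.75 * benchmark:
        return "Critics agree more than usual"
    return "Typical spread"

print(consensus_label(90, 9.0, games))  # unusual spread *for a 90-average game*
```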
Let me know if you have any suggestions for how to calculate this.
MattEnth (talk) 00:30, 23 March 2017 (UTC)
- Just wanted to check in on the above.
- MattEnth (talk) 17:52, 16 May 2017 (UTC)
- Sorry, I've been super busy with my own research projects so, as you can see, I've barely been editing these past few months. Not exactly sure when it'll let up unfortunately. Still interested, but I just don't have the bandwidth to devote much time to it these days. Axem Titanium (talk) 23:00, 17 May 2017 (UTC)