Where in the HypeCycle is GraphQL in 2021? Analyzing public data from Google Trends, StackOverflow, GitHub and HackerNews
Is the GraphQL hype over? Was it just a trend? If you're a regular on reddit/r/graphql you might have noticed the discussion about GraphQL and Google Trends recently.
If you look at a single graph from Google Trends, you might think that GraphQL is indeed through the hype cycle and that interest is declining. But is that really the case? Does the decline in the Google Trends graph indicate the end of the GraphQL hype, or is it just a correction caused by other factors?
To answer the question, we'll look at publicly available data from StackOverflow, GitHub and HackerNews.
Additionally, I'll explain the methodology so that you can verify my results or apply the same pattern to other trends you're interested in.
Let's start by looking at 7 graphs so that we can understand the full picture.
Google Trends for the topic GraphQL#
Of all the data sources available for understanding tech trends, Google Trends is the easiest to access. That said, looking at Google Trends alone can be misleading, as we'll find out throughout this post.
Overall, GraphQL is trending upwards. However, from early 2020 on, there's a clear dent that we don't seem to be recovering from. It could be that we'll never reach the highs of 2019-2020 again.
Hacker News Mentions of GraphQL in comments, titles or URLs#
In order to search through historic data on Hacker News, we have to use publicly available data with BigQuery. I'll come back to how you can access this data later.
Looking at the graph, we can again identify an overall upwards trend. The dent from Google Trends does not seem to show up here.
Hacker News Mentions cumulative of GraphQL in comments, titles or URLs#
If we add up all mentions on HN the picture gets even clearer. Would you call it a hockey stick?
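A cumulative series is simply a running total of the monthly counts. Here is a minimal sketch in Python, using made-up numbers (the list `monthly_mentions` is hypothetical, not data from this post). One thing worth keeping in mind: monthly counts can't be negative, so a running total can never go down. What matters is how steeply the curve rises, not the fact that it rises.

```python
from itertools import accumulate


def cumulative(monthly_counts):
    """Return the running total of a list of monthly counts."""
    return list(accumulate(monthly_counts))


# Hypothetical monthly mention counts, for illustration only.
monthly_mentions = [3, 5, 8, 13]
print(cumulative(monthly_mentions))  # [3, 8, 16, 29]
```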
Repos created on GitHub with GraphQL in the name#
Next up, we'll look at repositories created on GitHub containing the term "graphql" in the repo name. Keep in mind that these are only publicly available repositories.
Accessing this data requires us to use BigQuery again. There's a section at the bottom on how to access it.
Like the Google Trends graph, we have an upwards trend with a dent from early 2020 on. What's different is that the Google Trends graph dropped by ~40% whereas repos on GitHub only saw a small correction.
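If you'd rather put a number on a "dent" than eyeball it, one option is to measure the largest relative decline from a previous peak. The following is a rough sketch of that idea, not the method used for the graphs in this post. The function name `drop_from_peak` and both sample series are assumptions made up for illustration.

```python
def drop_from_peak(values):
    """Return the largest relative decline from a running peak, as a fraction."""
    peak = None
    worst = 0.0
    for value in values:
        if peak is None or value > peak:
            # New high: later declines are measured against this value.
            peak = value
        elif peak > 0:
            worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical monthly values, not real data.
trends_index = [60, 80, 100, 70, 60, 65]
repos_created = [900, 1000, 1100, 1050, 1000, 1080]

print(f"{drop_from_peak(trends_index):.0%}")   # 40%
print(f"{drop_from_peak(repos_created):.0%}")  # 9%
```

Applying the same measure to every dataset makes the dents comparable, even though the raw numbers are on very different scales.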
Repos created on GitHub cumulative with GraphQL in the name#
If we add up all the repos created on GitHub we get a hockey stick again.
StackOverflow Mentions of the topic GraphQL#
Another good indicator could be StackOverflow. Similarly to the GitHub data, it's not as easy to get these results. Luckily, the StackExchange Data Explorer allows us to query the historic data of StackOverflow.
If we look at mentions of the topic "graphql" on SO, we can see an overall upwards trend which also replicates the dent in early 2020.
StackOverflow Mentions of the topic GraphQL cumulative#
Adding up all mentions on SO gives us another hockey stick.
Views on StackOverflow with the topic GraphQL#
Views on StackOverflow caught my attention the most, as they paint a different picture than all the other graphs. The graph peaked between 2017 and 2018; from there on, it's in a steady decline towards zero.
Views on StackOverflow cumulative with the topic GraphQL#
If we look at cumulative views on SO, this gets even clearer, as the curve flattens out like an asymptote.
It's a very human thing that we want to make sense of the world. Looking at this graph, you might be tempted to immediately explain to yourself why views on SO are not growing while new mentions add up frequently.
If someone presents you with statistics, it's always a good idea to put the data into perspective. Never look at one single graph. Never look at just one dataset.
If we look at other graphs from StackOverflow, we can see something very interesting.
Let's look at cumulative views on StackOverflow for another popular technology with a similar lifespan: Kubernetes.
Mentions of the topic Kubernetes on StackOverflow#
Similarly to GraphQL, mentions of the topic Kubernetes are growing steadily. If you look closely, you can also spot a dent from early 2020 on.
Views of the topic Kubernetes cumulative on StackOverflow#
Cumulative views for the topic Kubernetes on StackOverflow paint a similar picture: the curve is also asymptotic.
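"Asymptotic" is something I'm reading off the graphs by eye. If you want a number instead, one simple option is to check how much of the all-time total comes from the most recent months. This is a sketch under my own assumptions: the function name `recent_share`, the 12-month window and the sample values are all made up for illustration.

```python
def recent_share(monthly_values, last_n=12):
    """Return the fraction of the all-time total contributed by the last_n months."""
    total = sum(monthly_values)
    if total == 0 or last_n <= 0:
        return 0.0
    return sum(monthly_values[-last_n:]) / total


# Hypothetical monthly view counts, not real data.
flattening = [500, 900, 1200, 800, 300, 100, 40, 10]
still_growing = [100, 200, 300, 400, 500, 600, 700, 800]

print(round(recent_share(flattening, last_n=3), 3))
print(round(recent_share(still_growing, last_n=3), 3))
```

A small share means the cumulative curve has mostly flattened out, while a large share means recent months are still adding a lot to the total.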
What is wrong with StackOverflow?#
Why do we see a lot of new questions being asked on SO but no new views? I've tried this with other topics as well, and it's always the same picture.
To find an answer, have a look at SO yourself and browse through one of the topics. What you'll find is a lot of new questions without an answer or an upvote. Some questions even get voted down.
What does this mean?
Well, after the initial "honeymoon phase" people seem to stop answering questions on new technologies. Maybe this is just the way StackOverflow works, maybe there's something wrong with it, I'm not sure. The "experts" seem to move on or don't see value in answering questions continuously. Maybe there's not enough benefit for them to keep doing this work.
If you have any clue, please let me know!
That said, I don't see how this overall trend on SO views can be used to judge GraphQL as a technology.
I can see a lot of upwards trends for GraphQL with corrections from early 2020 on. I think this is due to the pandemic. I also think it's just a correction, and we'll recover from it. That said, I'm not able to prove it, so don't just take my word for it.
Look at the raw data and make up your own mind.
That said, I think StackOverflow is suspicious. Something is going on there.
Now, onto the fun part! Let's give you the tools, so you can analyze these datasets on your own.
Analyzing tech trends using public data from GitHub via BigQuery#
You can search through repositories in GitHub using BigQuery public datasets: https://cloud.google.com/bigquery/public-data
To query the GitHub dataset, use the following query:
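-- Each row in the GitHub Archive tables is one event on a public repository.
-- _table_suffix limits the scan to the monthly tables of 2015; adjust the range for other years.
select
  concat(
    cast(extract(year from created_at) as string),
    '-',
    -- zero-pad the month so that the text sort in "order by" is chronological
    lpad(cast(extract(month from created_at) as string), 2, '0')
  ) as year_month,
  count(1) as repos
from `githubarchive.month.*`
where repo.name like '%graphql%'
  and _table_suffix between '201501' and '201512'
group by year_month
order by year_month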
Analyzing tech trends using public data from Hacker News via BigQuery#
You can search through all the data on Hacker News using BigQuery public datasets: https://cloud.google.com/bigquery/public-data
Here's the query used to create the graphs above.
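-- Monthly count of Hacker News items that mention "graphql" in the text, title or URL.
select
  concat(
    cast(extract(year from timestamp) as string),
    '-',
    -- zero-pad the month so that the text sort in "order by" is chronological
    lpad(cast(extract(month from timestamp) as string), 2, '0')
  ) as year_month,
  count(1) as mentions
from `bigquery-public-data.hacker_news.full`
where lower(text) like '%graphql%'
   or lower(title) like '%graphql%'
   or lower(url) like '%graphql%'
group by year_month
order by year_month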
Analyzing tech trends using public data from StackOverflow via StackExchange Data Explorer#
Querying data on StackOverflow is a bit different from the other options. StackExchange offers the "Data Explorer" which allows us to compose queries and ask for data: https://data.stackexchange.com/stackoverflow/query/new
You can use the following query as a starting point:
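with mentions_per_month as (
  select
    -- zero-pad the month so that ordering by year_month is chronological,
    -- which the cumulative sums below rely on
    concat(datepart(year, CreationDate), '-', right(concat('0', datepart(month, CreationDate)), 2)) as year_month,
    count(1) as mentions,
    -- ViewCount is each post's total view count to date,
    -- so views are attributed to the month the post was created
    sum(ViewCount) as views
  from Posts
  where Title like '%##Topic##%'
  /* and datepart(year, CreationDate) like '2016' */
  group by
    concat(datepart(year, CreationDate), '-', right(concat('0', datepart(month, CreationDate)), 2))
)
select
  year_month,
  mentions,
  sum(mentions) over (order by year_month) as mentions_cum,
  views,
  sum(views) over (order by year_month) as views_cum
from mentions_per_month
order by year_month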
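If you export the results as CSV and process them yourself, keep in mind that year-month labels are text. Labels such as 2016-2 and 2016-10 don't sort chronologically as plain strings. Here is a small sketch that parses the labels into numbers before sorting, so it works with or without zero-padded months. The helper names `month_key` and `load_monthly`, and the sample CSV, are made up for illustration; the column name `year_month` matches the queries above.

```python
import csv
import io


def month_key(label):
    """Turn a label like '2016-2' or '2016-02' into a sortable (year, month) tuple."""
    year, month = label.split("-")
    return int(year), int(month)


def load_monthly(csv_text, value_column):
    """Read (year_month, value) pairs from CSV text, sorted chronologically."""
    rows = csv.DictReader(io.StringIO(csv_text))
    data = [(row["year_month"], int(row[value_column])) for row in rows]
    return sorted(data, key=lambda item: month_key(item[0]))


# Hypothetical export, not real data.
sample = "year_month,mentions\n2016-10,7\n2016-2,3\n2016-1,5\n"
print(load_monthly(sample, "mentions"))
# [('2016-1', 5), ('2016-2', 3), ('2016-10', 7)]
```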