TL;DR: Spanish Government has a web service to query power generation statistics. But it is locked behind some obscurity and lack of docs. I have put together some scripts to fetch all raw data. Fork it on GitHub or download a CSV with all historic data.
For some time, I have been increasingly interested in electricity demand statistics. Some witty folks are using demanda.ree.es, a Flash visualizer for power generation and demand in Spain, to measure interesting stuff such as the impact of turn-off-the-lights protests or solidarity acts (1) or the success of a general strike (1, 2, 3). These uses are really exciting and I think we can get much more out of this kind of data. So I decided to provide easy access to it.
Let's go to https://demanda.ree.es/demanda.html and fire up Google Developer Tools.
GET https://demanda.ree.es/WSVisionaV01/wsDemanda30Service?WSDL GET https://demanda.ree.es/WSVisionaV01/wsDemanda30FinoService?WSDL GET https://demanda.ree.es/WSVisionaV01/wsDemanda30Service?xsd=1 GET https://demanda.ree.es/WSVisionaV01/wsDemanda30FinoService?xsd=1 POST https://demanda.ree.es/WSVisionaV01/wsDemanda30FinoService POST https://demanda.ree.es/WSVisionaV01/wsDemanda30Service POST https://demanda.ree.es/WSVisionaV01/wsDemanda30FinoService POST https://demanda.ree.es/WSVisionaV01/wsDemanda30FinoService
Great! It's just a SOAP web service. This is going to be easy. A Python script using suds will do... well, not so fast.
The first request is a call to the
consultaTiempo, a method that returns the current time. It works. But the rest of methods require a parameter
clave (Spanish for
key). Dammit. And it's not just some constant API key that I can copy from Developer Tools to my script. It changes every few seconds and the older ones are no longer valid. Every time there is a call to
consultaTiempo, the key changes. Hey! That means it's time-based!
After a quick analysis between some timestamps and their corresponding times, I realize that keys increase at a constant rate and they are strongly correlated to the timestamp. Each unit increase in the key corresponds to roughly 1 second in time. But I cannot figure it out.
Here it is. It always calls the
consultaTiempo method just before retrieving some data:
00024) + 1:1 pushstring "consultaTiempo" 00025) + 2:1 findpropstrict <q>[private]VisionaFlexCO2::_VisionaFlexCO2_Operation2_c 00026) + 3:1 callproperty <q>[private]VisionaFlexCO2::_VisionaFlexCO2_Operation2_c, 0 params 00027) + 3:1 pushstring "prevProgFino" ... 00030) + 5:1 pushstring "valoresMaxMinFino"
Following the function calls, I get to a function called
prueba (Spanish for
test). That's funny. It looks as if they "ofuscated" the key-generation function under a non-important name. But the key is generated there, and it seems an easy operation:
00023) + 0:1 getlocal_2 00024) + 1:1 pushbyte 5 00025) + 2:1 dup. 00026) + 3:1 callproperty <q>[namespace]http://adobe.com/AS3/2006/builtin::substr, 2 params 00027) + 1:1 coerce_s 00028) + 1:1 setlocal_2 00030) + 0:1 findpropstrict <q>[public]::parseFloat 00031) + 1:1 getlocal_2 00032) + 2:1 callproperty <q>[public]::parseFloat, 1 params 00033) + 1:1 convert_d 00034) + 1:1 setlocal_3 00036) + 0:1 getlocal_3 00037) + 1:1 pushdouble 1.307000 00038) + 2:1 divide. 00039) + 1:1 convert_d 00040) + 1:1 setlocal_3 00042) + 0:1 getlocal_3 00043) + 1:1 convert_i 00044) + 1:1 setlocal r4 00046) + 0:1 findproperty <q>[private]VisionaFlexCO2::clave
Ok. Got it. So I already have the Python equivalent:
key = str(int(float(timestamp[5:10])/ 1.307000))
Yay! It works! The only thing missing are a few SOAP calls and some boring JSON dumping code.
Show me the code!
The resulting script for fetching all historic data and generatic some graphs can be found in ree-demanda repo at GitHub. Everything it's pretty basic, but it gets the job done.
I fetched all historic data from the web service and consolidated it in a single CSV file. You can find it at https://github.com/smola/ree-demanda. I plan to update it regularly and, eventually, set up a simple REST API to query it. In the mean time, if you find it is outdated, poke me at email@example.com and I will update it.
Here is a visual example of the data. Power generated since 2007 in Spain. Each line is a different power source. Data is relative to the total power demand at a given time.
What would you do with this data? Are you interested in a specific graph, crossing it with other data sources? Would it be useful for an online service? What knowledge can be obtained out of it?
If you have any idea, please, use the comments or write me at firstname.lastname@example.org.