
Data from Kaggle

Data were obtained from Kaggle. The dataset contains a list of video games with sales greater than 100,000 copies and was generated by a scrape of vgchartz.com. Fields include:

  • Rank – Ranking of overall sales
  • Name – The game's name
  • Platform – Platform of the game's release (e.g. PC, PS4)
  • Year – Year of the game's release
  • Genre – Genre of the game
  • Publisher – Publisher of the game
  • NA_Sales – Sales in North America (in millions)
  • EU_Sales – Sales in Europe (in millions)
  • JP_Sales – Sales in Japan (in millions)
  • Other_Sales – Sales in the rest of the world (in millions)
  • Global_Sales – Total worldwide sales (in millions)
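The schema above can be sketched in pandas. The two rows below are only illustrative entries in the dataset's format (approximate values for two well-known titles), not a substitute for downloading the real file:

```python
import pandas as pd

# Toy rows mimicking the Kaggle schema; values are illustrative only.
df = pd.DataFrame({
    "Rank": [1, 2],
    "Name": ["Wii Sports", "Super Mario Bros."],
    "Platform": ["Wii", "NES"],
    "Year": [2006, 1985],
    "Genre": ["Sports", "Platform"],
    "Publisher": ["Nintendo", "Nintendo"],
    "NA_Sales": [41.49, 29.08],
    "EU_Sales": [29.02, 3.58],
    "JP_Sales": [3.77, 6.81],
    "Other_Sales": [8.46, 0.77],
    "Global_Sales": [82.74, 40.24],
})

# Sanity check: Global_Sales should equal the sum of the regional columns.
regions = ["NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales"]
assert (df[regions].sum(axis=1) - df["Global_Sales"]).abs().max() < 0.01
```

The same consistency check is a useful first step on the full dataset, since a handful of rows have missing or inconsistent sales figures.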

Global sales

Global Sales by Year trends

Looking at how total global sales changed over the years, we can see that sales had been increasing continuously since 1980, but a sharp decline can be observed over the last seven years.

Sales by regions

Global Sales by Region

Looking at the sales by region, we can see that North America made the biggest contribution to global sales – almost 50%. Together, the NA and EU regions accounted for over 75% of global sales.
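The regional shares quoted above are easy to reproduce. A minimal sketch, using per-region totals roughly matching common snapshots of the dataset (treat the exact numbers as illustrative):

```python
import pandas as pd

# Region totals in millions of copies; approximate, snapshot-dependent.
totals = pd.Series({
    "NA_Sales": 4392.95,
    "EU_Sales": 2434.13,
    "JP_Sales": 1291.02,
    "Other_Sales": 797.75,
})

share = 100 * totals / totals.sum()
print(share.round(1))

# NA alone contributes close to half, NA + EU together over 75%.
assert share["NA_Sales"] > 45
assert share["NA_Sales"] + share["EU_Sales"] > 75
```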

Sales by regions timeseries

Here we can see how sales in the regions varied over time. In 1995 North America became the biggest market. In 2008 the sales began to decrease in all regions. The biggest relative fall was in North America.

Number of releases

Number of releases by Year

Here we can see how the number of new game releases was changing. It kept growing up to 2009, when it started to drop rapidly.

Who released the most?

This image shows the top publishers by number of releases. The two companies with the most releases were Electronic Arts and Activision.

Releases by Years

Top Publishers by Releases by Years

Here we can see how releases were distributed over time. The top publisher – Electronic Arts – entered the market in 1992 and made the most releases in 2005-2009.

Who made the sales?

Top Publishers by Total Global Sales

Electronic Arts and Activision made the most releases, but looking at the sales we can see that Nintendo achieved the highest total global sales. Electronic Arts and Activision are right behind Nintendo.

What did the sales look like?

Top Publishers by Global Sales by Years

This image shows how sales evolved over time for the different publishers.


Releases by Platforms

Here we can see the number of releases for the different platforms. The two top platforms were DS and PS2.

Time marks of release dates for all platforms

Platforms by Years legend

This image shows the distribution of releases for the different platforms. Most DS releases were in 2004-2014, and most PS2 releases in 2000-2011. The typical release window for the main platforms lasts about 10 years. PC had the largest release window.

Number of releases for platforms

Genres by Platforms

This plot shows the number of releases in different genres per platform. The top 5 platforms had the most releases in the Action category; the single largest platform-genre count, however, was Sports games on the PS2.

Number of releases by genres

Number of Releases by Genres

Here we can see the number of releases by genre. Action games accounted for 20% of all games released; Action and Sports together made up one third of all releases.

Releases by Genres by Years

This plot shows the moving average of releases over the years. In 2004, Action games dethroned Sports games. Sports games had the biggest relative fall (since 2010). Action games were released the most, but after 2011 their numbers were falling as well.

Global sales by genres

Global Sales by Genres

Action games are the top-selling category.

Global Sales by Genres by Years

This plot shows the moving average of sales over time. The biggest falls can be observed for two top genres: Action and Sports.

Sales by genres in regions

Sales by Genres by Areas

Sales by Genres by Areas tab

In Japan the most popular genre was Role-Playing (Pokemon series?). In other regions the most popular genre was Action.

Top games by global sales

Top Games by Global Sales

This image shows the best-selling games. The top one is Wii Sports, with GTA 5 in second place.

Top games by global sales in regions

Top Games by Sales

Looking at sales by region, we can see that the top game, Wii Sports, sold mostly in North America. GTA 5 was split almost 50:50 between EU and NA. In Japan, Pokemon games significantly outperformed other titles.

Final notes

  • Over the last few years, both sales and the number of releases have been decreasing
  • The NA and EU regions together contributed over 75% of global sales
  • Electronic Arts and Activision were the publishers with the most releases
  • The top-selling publisher was Nintendo
  • The most popular genres were Action and Sports (34.5% of total sales)
  • Wii Sports, GTA 5, and Super Mario Bros were the best-selling games

Some HDR panoramas and sIBL setups will be posted on the new HDR Panoramas site. Feel free to download and use them.

HDR panorama sIBL Glade 01 render

HDR panorama sIBL Glade 01 spectrum



Criteria for detection of transiently evoked otoacoustic emissions in schoolchildren
Bartosz Trzaskowski, Edyta Pilka, W. Wiktor Jedrzejczak, Henryk Skarzynski
International Journal of Pediatric Otorhinolaryngology 79 (2015), pp. 1455-1461

The aim was to compare, on the same dataset, existing detection criteria for transiently evoked otoacoustic emissions (TEOAEs) and to select those most suitable for use with school-aged children.

TEOAEs were recorded from the ears of 187 schoolchildren (age 8–10 years) using the Otodynamics ILO 292 system with a standard click stimulus of 80 dB peSPL. Pure tone audiometry and tympanometry were also conducted. Global and half-octave-band (at 1, 1.4, 2, 2.8, 4 kHz) values of OAE signal-to-noise ratio (SNR), reproducibility, and response level were determined. These parameters were used as criteria for detection of TEOAEs. In total, 21 criteria based on the literature and 3 new ones suggested by the authors were investigated.

Pure tone audiometry and tympanometry screening generated an ear-based failure rate of 7.49%. For TEOAEs, there was a huge variability in failure rate depending on the criteria used. However, three criteria sets produced simultaneous values of sensitivity and specificity above 75%. The first of these criteria was based only on a global reproducibility threshold value above 50%; the second on certain global reproducibility and global response values; and the third involved exceeding a threshold of 50% band reproducibility. The two criteria sets with the best sensitivity were based on global reproducibility, response level, and signal-to-noise ratio (with different thresholds across frequency bands).

TEOAEs can be efficiently used to test the hearing of schoolchildren provided appropriate protocols and criteria sets are used. They are quick, repeatable, and simple to perform, even for nonaudiologically trained personnel. Criteria with high sensitivity (89%) were identified, but they had relatively high referral rates. This is not so much a problem in schoolchildren as it is in newborns because with schoolchildren pure tone audiometry and tympanometry can be performed immediately or at a follow-up session. Nevertheless, high referral rates lead to increased screening cost; for that reason, three less rigorous criteria with high values of both sensitivity and specificity (75% and above) are recommended.

Comparison of wave V detection algorithms in auditory brainstem responses.
Bartosz Trzaskowski
Nowa Audiofonologia 2015; 4(2):43-52

In this paper, selected systems for automatic ABR detection, described in scientific journals by different research teams, are presented and compared.

Otoacoustic Emissions before and after Listening to Music on a Personal Player
Bartosz Trzaskowski, W. Wiktor Jędrzejczak, Edyta Piłka, Magdalena Cieślicka, Henryk Skarżyński
Med Sci Monit 2014; 20:1426-1431

The main aim of this study was to investigate whether listening to music on a CD player affects parameters of otoacoustic emissions. A group of 20 adults with normal hearing were tested. No statistically significant changes in either OAE parameters or PTA thresholds were found.

New paper published in Otorynolaryngologia presenting results of the evaluation of a system for automatic detection of auditory brainstem responses

System for automatic detection of auditory brainstem responses. II. Evaluation of the system for clinical data.
Bartosz Trzaskowski, Krzysztof Kochanek, W. Wiktor Jędrzejczak, Adam Piłka, Henryk Skarżyński
Otorynolaryngologia, 2013; 12(4): 183-189

New paper published in Otorynolaryngologia:

System for automatic detection of auditory brainstem responses. I. Characteristics and tests.
Bartosz Trzaskowski, W. Wiktor Jedrzejczak, Edyta Pilka, Krzysztof Kochanek, Henryk Skarzynski
Otorynolaryngologia, 2013; 12(3): 137-147

In the previous post I presented the response function of a Nikon D70 saving images in JPG Fine format. In this article I would like to show the response function of this camera while saving images as NEF, which is Nikon's native format for recording raw data (RAW). Images in this format are essentially unprocessed recordings of the data registered on the camera sensor. By unprocessed I mean, in this case, the lack of camera postprocessing such as gamma conversion or white balance. However, processing of the data registered by the sensor already starts during digitization, that is, the analog-to-digital conversion of the analog information from the sensor (in the form of charge) into discrete digital form. Such quantization of continuous values is itself a form of processing. But as some inquisitive users have reported, Nikon alters the recorded information already at this stage.

In particular, it has been proven that recording images in the lossless-compressed NEF format used in the Nikon D70 is in fact lossy, contrary to the information given by the manufacturer. For the most part, the loss of information is associated with lowered resolution in the highlights. The Nikon D70 is equipped with a Sony ICX413AQ sensor and a 12-bit analog-to-digital converter. 12-bit resolution allows recording 2^12 = 4096 levels of brightness. But when converting to RAW, the number of levels is limited to 683, and only then are the reduced data passed through lossless compression similar to that used in ZIP files. While the compression itself is indeed lossless, information is lost at the stage of quantization to 683 discrete brightness values. The quantization curve is saved in the NEF files. This encoding preserves the full dynamic range, but converting 12-bit information (4096 levels) into 683 discrete values decreases the brightness resolution. The shape of the quantization curve, which starts linearly and then increases quadratically, lowers the resolution as brightness increases. The purpose of this type of conversion was probably to gain a significant speedup (almost an order of magnitude) during in-camera processing and recording of NEF files. Older models (such as the D1H and D100) could spend 20-30 seconds compressing an image before saving it. Nowadays some recent camera models allow selecting the NEF recording mode: 12- or 14-bit, compressed or uncompressed.
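The effective bit depth implied by the 683-level quantization can be checked directly:

```python
import math

# The D70 ADC digitizes 12 bits (4096 levels), but NEF encoding keeps
# only 683 discrete brightness values before lossless compression.
adc_levels = 2 ** 12          # 4096
nef_levels = 683

effective_bits = math.log2(nef_levels)
print(f"effective resolution: {effective_bits:.2f} bits")  # ~9.42 bits
assert effective_bits < 12
```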

Additionally, some loss of information in the D70 is related to the optical low-pass filter designed to remove high-frequency components from the image. When capturing a scene, the camera sensor is actually sampling it at a certain frequency, the sampling rate. If the scene contains frequencies higher than half the sampling frequency, aliasing will occur in the recorded image, known to photographers and graphic artists as the moiré effect. To counteract this phenomenon, digital camera manufacturers have for many years placed a blurring filter in front of the sensor. This ensured that frequencies higher than acceptable for a given sensor were filtered out and the sensor captured only the allowed frequencies. It is also easy to guess the relation between the sampling frequency and the sensor size (in megapixels): the larger the size in MPx, the higher the sampling rate, and the higher the frequencies a given sensor can record without risk of aliasing. Recently, due to technological advances and the continuing growth of recorded image sizes, there has been a tendency to remove this filter, which improves sharpness and increases the detail of photos. Even Nikon removed the optical low-pass filter from its latest model, the D5300, assuming that moiré is not such a problem with 24 MPx images.

However, despite the transformations mentioned above, one would expect that, because the camera performs no gamma compression (as it does for JPG files) and the sensor in the Nikon D70 is a CCD (with charge proportional to exposure), the response function here may be close to linear. I decided to investigate, so I performed the calculations presented below.

I analyzed two series of photos taken with different exposures. The photos were taken exactly as described in the previous article, with one difference: this time they were saved in NEF format. In fact, they were exactly the same scenes photographed under the same conditions, because the NEF series were taken immediately before the JPG series.

Both series of photos in NEF format look like this:

Brackets. Series 1. NEF.

Brackets. Series 1. NEF.

Brackets. Series 2. NEF.

Brackets. Series 2. NEF.

It is easy to notice that, for both series, pictures in NEF format look much darker and more contrasty than pictures in JPG format. This is because images in NEF format are in a linear color space, while images in JPG format are in a logarithmic one. Our visual system, just like all our senses, naturally works logarithmically. Images presented in a logarithmic color space (like those taken with conventional film photography) seem much more natural and photorealistic to us. That is why, before recording images in their final JPG form, the camera compresses them into a logarithmic color space using a power-law transformation – so-called gamma conversion. This produces a distribution of pixel values that looks much more natural to us.
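The gamma conversion described above is a simple power-law mapping. A minimal sketch (gamma = 1/2.2 is a common sRGB-like assumption, not necessarily the camera's exact curve):

```python
import numpy as np

# Power-law (gamma) encoding: linear sensor values are compressed so
# that dark tones receive more code values, matching human perception.
# gamma = 1/2.2 is an assumed, sRGB-like value for illustration.
linear = np.linspace(0.0, 1.0, 5)
encoded = linear ** (1 / 2.2)
print(np.round(encoded, 3))
```

Note how the midpoint of the linear range (0.5) maps to roughly 0.73 after encoding, which is why gamma-encoded images look brighter and more natural than their linear counterparts.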

The actual bit resolution of Nikon D70 NEF files is log2(683) = 9.42 bits. In order to perform the calculations, I converted the NEF files into 8-bit TIFF files using dcraw, a program written by Dave Coffin. The conversion must not introduce any automatic brightness changes (by default dcraw stretches the histogram so that 1% of the pixels are displayed as white), must not perform gamma conversion or pixel resampling, and must not change the color space of the image. The command I used was “dcraw -T -W -g 1 1 -v -j -o 0”. NEF images captured by the Nikon D70 are 3039×2014 px. The mask selecting pixels for the calculations was created the same way as for the JPG images – as a 30×20 array of points evenly spaced over the image while maintaining a 5% margin from the edges.
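The sampling mask described above can be sketched as follows. This is a hypothetical reconstruction of the grid, not the original code:

```python
import numpy as np

# 30x20 grid of sample coordinates, evenly spaced with a 5% margin
# from the image edges (NEF image size as stated: 3039x2014 px).
width, height = 3039, 2014
mx, my = int(0.05 * width), int(0.05 * height)

xs = np.linspace(mx, width - 1 - mx, 30).astype(int)
ys = np.linspace(my, height - 1 - my, 20).astype(int)
grid = [(x, y) for y in ys for x in xs]

assert len(grid) == 600
```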

In order to properly estimate a response curve for 14-bit NEF files with this method, for a series of 11 exposures, the relationship N*(P-1) > (Zmax-Zmin) gives N > (2^14-1)/(11-1), i.e. at least 1639 pixels. The matrix for the system of linear equations would occupy (1639*11+2^14+1)*(2^14+1639)*16/1000/1000/1000 = 9.9 GB of RAM. Of course, the function proposed by Debevec and Malik would also have to be modified to account for the 14-bit resolution: Zmax = 2^14-1, with the weighting distribution shifted to brightness Z = (2^14)/2-1.
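The sizing arithmetic above can be reproduced directly (11 exposures, 16 bytes per matrix element, as assumed in the text):

```python
# Minimal pixel count N and matrix RAM for the least-squares system
# in the hypothetical 14-bit case discussed above.
P = 11                       # number of exposures
levels = 2 ** 14             # 14-bit brightness levels

n_min = (levels - 1) // (P - 1) + 1   # smallest N with N*(P-1) > Zmax-Zmin
assert n_min == 1639

rows = n_min * P + levels + 1
cols = levels + n_min
ram_gb = rows * cols * 16 / 1000 / 1000 / 1000
print(f"N >= {n_min}, RAM ~ {ram_gb:.1f} GB")
```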

Response curves for NEF files with Nikon D70:

Nikon D70 response curve. NEF

Nikon D70 response curve. NEF

Nikon D70 response curve. NEF

Nikon D70 response curve. NEF

We can observe that the response curve for the NEF format has a much steeper profile than the curve for JPG coding from the previous article. But it is hard to say more about the differences between them, as both are plotted here on a semilog scale. To compare them in more detail we have two options: plot them on a log-log scale, or on a linear-linear scale.

Plot with both axes linear for series 1:

Nikon D70 response curve. NEF. lin-lin

Nikon D70 response curve. NEF. lin-lin

Plot with both axes logarithmic for series 1:

Nikon D70 response curve. NEF. log-log

Nikon D70 response curve. NEF. log-log

In both these figures, the response function for NEF format is linear up to the range of saturation.

In the log-log plot, the points located in the bottom-left part of the space may raise serious concerns about the quality of the response curve estimation: there is a large dispersion of points along the X axis (logarithmic exposure value) and large gaps between consecutive values along the Y axis (logarithmic pixel value). The gaps along the Y axis become clear once we realize what the transformation from linear to logarithmic scale does: a logarithmic scale spreads the low values widely along the axis and compresses the higher ones more and more. The big gaps between values in this region are thus a natural consequence of transforming linear values onto a logarithmic scale. We can estimate that the lower half of the Y axis (the 0-3 range) contains exp(3) ≈ 20 pixel values out of 256, which means that in this scale half of the space is taken by less than 8% of all possible pixel values. On a linear scale those values would occupy only a negligibly small space in the bottom-left corner of the plot. The dispersion along the X axis is easily explained by the characteristic tendency of CCD sensors to register noise at low exposures, combined with the fact that half of the plot is taken by those 20 lowest-exposure pixel values. That is why a weighting function decreasing their weight in the calculations was used in the response curve estimation.

But what is the profile of the function for JPG formats in these scales?
Plot lin-lin:

Nikon D70 response curve. JPG. lin-lin

Nikon D70 response curve. JPG. lin-lin

Plot log-log:

Nikon D70 response curve. JPG. log-log

Nikon D70 response curve. JPG. log-log

On both of these plots, a nonlinear, gamma-like conversion can be observed.

Based on both articles and the presented results, the following conclusions about the Nikon D70 can be drawn:

  • saving pictures in JPEG format is associated with a nonlinear transformation with the profile presented in the plots above
  • saving images in NEF format, with flat conversion to 8-bit TIFFs by dcraw, produces images with a response function that is linear over a certain range.

In 1997 Debevec and Malik published an interesting work. They presented a method for estimating the response function of an image formation system based on a series of photos taken at different exposures. The method finds the function that minimizes (in the least-squares sense) the error of the solution to a system of linear equations relating pixel brightness to exposure.

In order to determine the response function of the Nikon D70 digital SLR with images encoded by the camera as JPGs, three series of shots were taken. Each series consisted of eleven consecutive shots with exposure changed in 1 EV steps over the [-5 EV, +5 EV] interval around the correct exposure value.

The scenes selected for the photos contained only static elements and neutral colors; the entire frame was filled with grays, without highly saturated elements. The scenes contained areas of contrasting brightness to ensure a wide distribution of points in [exposure, pixel value] space for a single exposure. This improves the quality of stitching the per-pixel curves together in the least-squares calculations.

Examples of two series of images used in the calculations are shown in the figures below.

Series 1:

Brackets. Series 1. JPG

Brackets. Series 1. JPG

Series 2:

Brackets. Series 2. JPG

Brackets. Series 2. JPG

The photos were taken on a tripod, sequentially one after another, at the shortest intervals allowed by the software remotely triggering the camera shutter from a smartphone over a USB OTG cable. It was important to take the pictures in the shortest possible time to avoid changes in lighting conditions during shooting. All series were taken on a cloudy day, with the sun completely behind the clouds. All shots used a fixed aperture, and the variable exposure was obtained by changing the shutter speed. This avoided the problems associated with changes of depth of field and vignetting. To minimize the sensor noise component, the minimum available sensitivity of ISO 200 was set. The image size was set to maximum.

In the calculations, the exact shutter times displayed by the camera (e.g. 1/500, 1/250 or 1/125) were used, even though Debevec and Malik suggested that times based on powers of 2 (e.g. 1/512, 1/256 or 1/128) are a better approximation of the actual exposure times.

For a good accuracy-to-performance ratio, it is essential to select an adequate number of image pixels for the calculations. The Nikon D70 has a 6-megapixel sensor, and the largest images recorded by the camera in Fine JPG quality are 3008×2000 pixels. The system of linear equations minimizing the estimation error of the transfer function is of order N*P + Zmax - Zmin. Using all 6,016,000 pixels in a series of 11 exposures of 8-bit images would require allocating about (3008*2000*11+2^8+1)*(2^8+3008*2000)*8/1000/1000/1000 = 3,185,066 GB of RAM. Maybe in the future such calculations will be possible on a cell phone, but they are not currently (to the author's knowledge) feasible. In fact, to determine this system of equations it is sufficient to take N pixels satisfying the inequality N*(P-1) > (Zmax-Zmin), where N is the number of pixels, P the number of exposures, and Zmax-Zmin the maximum difference in pixel brightness. With 11 exposures and 8-bit images, a sufficient number of pixels is N > (2^8-1)/(11-1), that is N >= 26.
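The arithmetic above can be checked directly (8 bytes per matrix element, as assumed in the text):

```python
# Sizing for the 8-bit JPG case: full-frame RAM estimate and the
# minimal number of sampled pixels N for 11 exposures.
P = 11
levels = 2 ** 8
w, h = 3008, 2000

rows_full = w * h * P + levels + 1
cols_full = levels + w * h
ram_gb_full = rows_full * cols_full * 8 / 1000 / 1000 / 1000
print(f"full frame: ~{ram_gb_full:,.0f} GB")

n_min = (levels - 1) // (P - 1) + 1   # smallest N with N*(P-1) > Zmax-Zmin
assert n_min == 26
```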

The required minimum number of pixels is now known, but it is still necessary to decide how to sample the picture. In their paper, Debevec and Malik manually selected the pixels used in the calculations. Here, I decided to select an array of 600 (30×20) pixels, uniformly spaced over the image while maintaining a 5% margin from the edges of the picture. Because this number of pixels is an order of magnitude larger than required, the system was highly redundant. It occupied (30*20*11+2^8+1)*(2^8+30*20)*8/1000/1000 = 46.96 MB of RAM, and the calculations took a matter of seconds on a typical computer.

The following figures show the response functions obtained for the Nikon D70 with JPG Fine encoding for the three (RGB) color channels. To decrease the impact of extreme brightness values on the estimate of the transfer function, a triangular window function defining the weight of each pixel brightness was applied. The illustrations also show the points in [exposure, pixel value] space that were used to estimate the response function.
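The triangular window mentioned above can be sketched as in Debevec and Malik's paper, here for 8-bit values:

```python
# Triangular weighting: pixel values near the middle of the brightness
# range get full weight, while values near 0 and Zmax (prone to noise
# and saturation) get little.
z_min, z_max = 0, 255

def weight(z):
    mid = (z_min + z_max) / 2
    return z - z_min if z <= mid else z_max - z

assert weight(0) == 0      # dark extreme: no weight
assert weight(127) == 127  # mid-range: maximum weight
assert weight(255) == 0    # saturated extreme: no weight
```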

The response curve of a Nikon D70 camera with JPEG format encoding. Series 1.

Nikon D70 response curve. JPG

Nikon D70 response curve. JPG

The response curve of a Nikon D70 camera with JPEG format encoding. Series 2.

Nikon D70 response curve. JPG

Nikon D70 response curve. JPG

The curves obtained from the two series are virtually identical. They show the response curve of the camera for JPEG files.

The site has been moved to a new server. Please report all errors here.