An image sensor noise test

Here I present a simple test of three camera sensors to compare noise between them at base ISO. The test is oriented towards users interested in high resolution photography with high dynamic range, that is typically landscape photography. I do not evaluate high ISO settings at all.

How much noise there is in the raw image file is a key image quality factor. It is not the only one, but it has a large impact, is the easiest to measure, and is less subjective than many other image quality factors such as color rendition.

The image quality of medium format systems is often described as far superior to that of the smaller formats: higher resolution (obvious), better color rendition, finer tonality and higher dynamic range are often claimed. Tonality and dynamic range are very much about noise, which is the factor we are going to look into here.

I do this test for the following reasons:

  • Despite frequent claims of medium format's superior image quality, few actual tests look into noise performance for real, and those that do are often poorly made with misleading results.
  • Many seem to have missed the huge progress in noise performance in DSLRs that came with the introduction of the Sony Exmor sensor technology used in, for example, the Nikon D800 and D7000.
  • Second hand medium format is today a fairly affordable option for serious amateur photographers, so it is relevant to test how these backs perform compared to modern alternatives.
  • The popular website DxOMark provides high quality, scientifically made measurements of many sensors including medium format, but presents them only in graph form, which is hard to interpret. Here the purpose is to show results visually and discuss their practical meaning.

If I had access to any camera system, this test would be on the Nikon D800, Canon 5D mark 2, Phase One IQ180 and Phase One P25. I want to show how a state of the art DSLR (D800) compares to state of the art medium format (IQ180), and where a popular second hand MF alternative (P25) and yesterday's DSLR king (5Dmk2) stand in relation to that.

Unfortunately I did not have access to all those systems so this test is on the following cameras:

  • Nikon D7000. An APS-C camera with a Sony Exmor sensor, which represents a breakthrough in dynamic range. Pixel size is almost exactly the same as in the new full-frame D800, and other well-made measurements show that per pixel it performs almost exactly the same. Thus we can consider the 100% crops from the D7000 a good representation of 100% D800 crops.
  • Canon 5D mark 2. A very popular full-frame 135 camera from Canon. Canon is currently known for not having as good dynamic range in their sensors as the competitors, but since this camera was until very recently one of the most popular "high resolution" DSLR cameras, most "MF vs DSLR" tests were made against it. Unfortunately for Canon users, the recently released 5D mark 3 only provides a minor improvement in dynamic range at base ISO.
  • Leaf Aptus 75. Medium format digital back with a 36x48mm sensor. Released in late 2005, this discontinued 33 megapixel digital back is one good alternative for an amateur who wants to invest in second hand medium format gear. Although not the latest digital back, it still has comparable DR to many current backs (not as good as the IQ180 though!). In the test it has been mounted on a Linhof Techno with a 90mm lens. The lens is long enough to have negligible color cast.

So we get a good representation of Sony Exmor with results comparable to the D800, we get the 5D mark 2, and a good second hand medium format back, but unfortunately not state of the art medium format.

The test

The test procedure is very simple. I have set up a stable test scene and made an optimal ETTR exposure (verified by raw diagnostics), and from that underexposed the scene -1 to -9 stops with one stop decrements. These files are then developed in RawTherapee with the underexposure compensated for and compared visually to come to conclusions.

My goal is not to arrive at an exact measurement value, but to visually compare these cameras against each other, which I think is more relevant for practical use.

The upper row shows the original in-camera JPEG preview of 5Dmk2 test shots, with optimal ETTR to the left and then 1 - 9 stops underexposed. The bottom row shows the same files neutrally developed in RawTherapee and then exposure compensated 1 - 9 stops to get equal brightness. Obviously noise gets worse with more stops compensated.

The test scene

Test scene with color checker and a black object in the form of a camera lens, lit with a daylight lamp.
The test scene contains a colorchecker and a black object, in this case a lens. This is lit by a daylight lamp with very high CRI, not CRI 100 though, but this test is not for color accuracy so it should be ok.

I did not have lenses with the same field of view for all cameras, so the perspective is a bit different. The ETTR exposure has been calibrated against the brightest white patch on the color checker. The small symbols on the color checker are even brighter and are thus overexposed.

I have not cared much about focusing, so don't try to evaluate sharpness from these pictures.

The images have been developed in RawTherapee 4 with neutral settings, that is just demosaicing, white balancing according to a default color matrix, and a default gamma curve. The underexposed images have then been pushed with the exposure slider (which in this program behaves linearly when highlight recovery is disabled).
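The linear push described above can be sketched in a few lines of Python. This is only an illustration of what a linear exposure slider does to linear raw values, not RawTherapee's actual implementation; the function name and the normalized white level of 1.0 are assumptions made for the example.

```python
def push_exposure(linear_values, stops, white_level=1.0):
    """Brighten linear sensor values by `stops`, clipping at white_level.

    A linear push is a plain multiplication by 2**stops, so signal and
    noise are amplified by exactly the same factor.
    """
    gain = 2.0 ** stops
    return [min(value * gain, white_level) for value in linear_values]


# Two patches recorded 3 stops under (1/8 of the light) return to their
# original levels of 0.5 and 0.1 after a +3 stop push.
underexposed = [0.5 / 8, 0.1 / 8]
print(push_exposure(underexposed, 3))
```

Because the gain is the same for signal and noise, a pushed underexposure shows the noise that was already present in the dark parts of the raw file, which is what the comparison relies on.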

The look of the images from 5Dmk2 and Aptus75 is very similar, while D7000 is a bit more saturated. This is due to the default color matrix for the camera, no conclusions should be drawn from that. We do this test to evaluate noise, not color.

The brightness of the Aptus75 series of images varies slightly because a technical camera with a mechanical Copal shutter was used. These shutters are not as precise as electronic shutters, so there can be a variation of 1/4 stop or so.

Noise in mid-range areas of an ETTR exposure

The first reference ETTR exposure represents a perfectly exposed scene. If there are no noise issues in this exposure one can assume that the camera is "good enough" for practical use.

In the color checker's color patches all cameras have very low noise, as shown in figure 1. It is impossible to distinguish the noise levels of the different cameras by eye.

Figure 1. Colors from the color checker in the ETTR exposure. 100% crops, from left to right: Aptus 75, 5Dmk2 and D7000. The brown color is about 4 stops below saturation.

Noise in dark shadows of an ETTR exposure

In figure 2 we look at the black object (the lens), where light levels are about 8-9 stops below saturation. In a final picture this would be rendered very dark, but since a neutral setting was used when the test images were developed from the raw files, no contrast-enhancing S-curve is applied and it is rendered a bit brighter. Note that if this had been an actual photograph of that lens we would have used more light!
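As a small illustration of what "stops below saturation" means numerically, the sketch below converts between a linear raw level and its distance from clipping. The 14-bit white level of 16383 is a hypothetical value chosen for the example, and the black level is assumed to be already subtracted; the function names are also just for illustration.

```python
import math


def stops_below_saturation(value, white_level):
    """How many stops a linear raw value sits below the clipping level."""
    if value <= 0 or white_level <= 0:
        raise ValueError("value and white_level must be positive")
    return math.log2(white_level / value)


def level_at_stops_below(stops, white_level):
    """Linear raw level found `stops` below the clipping level."""
    return white_level / 2.0 ** stops


WHITE_LEVEL = 16383  # hypothetical 14-bit clipping level
for stops in (4, 8.5):
    level = level_at_stops_below(stops, WHITE_LEVEL)
    print(f"{stops} stops down: about {level:.0f} of {WHITE_LEVEL}")
```

Each stop halves the level, so only a few dozen raw levels out of roughly sixteen thousand remain at 8-9 stops down, which is why noise from the sensor becomes visible there.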

On a calibrated screen the noise is visible, at least in the 5Dmk2 crop. Most would not consider it disturbing though.

Figure 2. Detail from the black lens in the ETTR exposure. 100% crops, from left to right: Aptus 75, 5Dmk2 and D7000. The black color between the two green numbers "2" and "3" is about 8.5 stops below saturation. Note that the D7000 image may appear brighter (especially the numbers) because it is more in focus.

To make the noise visible on a darker screen, figure 3 is a version with the same image content but pushed three stops. One can see that the 5Dmk2 is noisiest by a wide margin, then comes the Aptus 75, and the D7000 is the cleanest.

Figure 3. Same as figure 2 but the images have been pushed 3 stops to make noise more visible.

Note that if we downscale to compensate for resolution, the Aptus and the 5Dmk2 will gain some on the D7000. This is shown in figure 4, where all images have been downscaled to 8 megapixels. The order of noise is still the same, but it is close between the Aptus and the D7000.

Figure 4. Same as figure 3 but the images have all been downscaled to 8 megapixel images. From left to right: Aptus 75, 5Dmk2 and D7000.
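Why the higher resolution cameras gain from downscaling can be estimated with a simple model. The sketch below assumes uncorrelated per-pixel noise, so that averaging n pixels divides the noise standard deviation by the square root of n; real noise (especially pattern noise) does not follow this exactly. The 33 megapixel figure is from the text above, while 21 and 16 megapixels are approximate nominal figures for the 5Dmk2 and D7000 that I have filled in for the example.

```python
import math


def downscale_noise_gain_stops(source_mp, target_mp):
    """Approximate noise reduction in stops when downscaling.

    Assumes uncorrelated per-pixel noise: averaging n pixels into one
    divides the standard deviation by sqrt(n).
    """
    pixels_averaged = source_mp / target_mp
    return math.log2(math.sqrt(pixels_averaged))


# Approximate sensor resolutions in megapixels (5Dmk2 and D7000 values
# are nominal figures assumed for this example).
cameras = {"Aptus 75": 33.0, "5Dmk2": 21.0, "D7000": 16.0}
for name, megapixels in cameras.items():
    gain = downscale_noise_gain_stops(megapixels, 8.0)
    print(f"{name}: about {gain:.2f} stops less noise at 8 megapixels")
```

Under this model the D7000 gains half a stop and the Aptus roughly one stop, so the Aptus closes in by about half a stop relative to the D7000, which fits the small shift seen in figure 4.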

Number of noise-free stops

How many stops are "virtually noise free" in these images? This is too hard to see in the ETTR image (I did not have a transmission wedge), but by looking at the pushed underexposed test images we can see that the Aptus controls the noise well down to about 7 stops from saturation, as shown in figure 5.

Figure 5. The darkest colors we can have before noise starts to become apparent on the best camera. Dark color patches from the 3 stop underexposed image, then brightened to show noise, with the original brightness shown in the bottom left corner. From left to right: Aptus 75, 5Dmk2 (bottom) and D7000. The blue patch (darkest) is about 7 stops from saturation, green -5.7, red -6.5. The dark gray forming the grid between patches is at -7.3 stops.

In figure 5 we can see that the 5Dmk2 is noisier and that the Aptus 75 and D7000 are very close, with the Aptus having a small edge which is a little bit larger if downscaled. The D7000 has the most trouble in the darkest blue color patch, where the noise is a bit blotchy.

Reaching the limit

So how dark can we go before the noise starts to become really uncontrolled? This is shown in figure 6, which is as dark as it gets before noise takes over.

Figure 6. The darkest colors we can have before noise becomes uncontrolled (very significant noise increase to the next stop darker), with the level chosen individually for each camera to match the others' noise levels as closely as possible. The original brightness is shown in the small patch. From left to right: Aptus 75, 5Dmk2 (bottom) and D7000. D7000 green patch at about -10 stops, Aptus at -9 and 5Dmk2 at -8.

In figure 6 the cameras are at different brightness levels chosen to get roughly the same noise, but since we only have 1 stop decrements an exact match could not be found. Anyway, this puts the noisiest camera, the 5Dmk2, at the brightest level, then comes the Aptus, and the D7000 has the cleanest dark shadows, just as we saw on the black object. So we can conclude that the Aptus has a (very) slight edge from saturation down to about -7 stops, then the D7000, through its very low noise levels from the chip, takes over and has the cleanest dark shadows.

Figure 7. One stop darker than figure 6; here the noise is considered "uncontrolled". From left to right: Aptus 75, 5Dmk2 (bottom) and D7000. Same spacing as before, D7000 green patch at about -11 stops, Aptus at -10 and 5Dmk2 at -9. Not seen in these crops, but in the full size image the Aptus shows clear noise unevenness due to dark noise, and the D7000 has some mild horizontal streaks. The 5Dmk2 has had patterned noise from the start.

Backlit scene shadow pushing

The obvious use of high dynamic range for a landscape photographer is to capture a backlit scene (say a sunset) without the use of a gradient filter or multiple exposures. This would mean that we underexpose the foreground by 2-3 stops, and then push it in post-processing.

In a typical sunset example the dark foreground will be 7-8 stops from saturation. Figure 8 shows the quality of these areas after being pushed 2 stops up to -6. If you want a "grunge HDR" look you would need to push much more and then the result would clearly be noisy.

Figure 8. Typical backlit scene shadow pushing scenario, -8 stop shadowed areas pushed up two stops to -6. From left to right: Aptus 75, 5Dmk2 (bottom) and D7000.

We are then in the range where the D7000 has the advantage. I have myself used the 5D mark 2 for landscape photography, and I would use multiple exposures in backlit scenes because the noise is otherwise a little bit too high for my taste, which can be seen in figure 8.

I would say that the noise in the D7000 is acceptable, and also in the Aptus 75, but it is closer to the limit (typically there are structures in a real picture that reduce the impact of the random noise). The D7000 does not have a wide margin either though, so you may still want to do multiple exposures or use gradient filters.

Even if the sensor noise were zero, the darkest parts of an image would be noisy because the signal itself is noisy - few photons means high shot noise. This may be the limit for the D7000. To solve this problem the sensor needs higher full well capacity, so that longer shutter speeds can be used and more photons captured. This would put the larger medium format sensor at an advantage, but as we see in this test that advantage does not compensate for its higher sensor noise.
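The interplay between photon shot noise and sensor noise can be sketched with the standard simplified pixel model: shot noise is the square root of the signal (in electrons) and read noise adds in quadrature. The full well and read noise numbers below are hypothetical values chosen for illustration, not measurements of the tested cameras.

```python
import math


def snr(signal_electrons, read_noise_electrons):
    """Signal-to-noise ratio with Poisson shot noise plus read noise.

    Shot noise is sqrt(signal); independent noise sources add in quadrature.
    """
    shot_noise = math.sqrt(signal_electrons)
    total_noise = math.sqrt(shot_noise ** 2 + read_noise_electrons ** 2)
    return signal_electrons / total_noise


# Hypothetical sensors: a small pixel with low read noise and a large
# pixel with higher read noise (values assumed for illustration only).
sensors = {"low read noise": (40000.0, 3.0), "large full well": (80000.0, 12.0)}
for stops in (4, 8, 10):
    for name, (full_well, read_noise) in sensors.items():
        signal = full_well / 2 ** stops
        print(f"-{stops} stops, {name}: SNR {snr(signal, read_noise):.1f}")
```

With numbers like these the larger full well wins in the brighter stops, where shot noise dominates, while the lower read noise wins in the deepest shadows, which is the same pattern as observed between the Aptus and the D7000.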

Summary of observations

Figure 9. Summary of observations presented in a scale of color checkers from the original test files in their original brightness. Approximate light level from saturation shown below the images.

Here is a summary of the observations discussed above, with some additional support from looking through the whole set of test images and comparing side by side on a calibrated screen:

  • Down to about 4.5 stops from saturation the cameras are hard to tell apart; all of them have low noise.
  • Compared to the others the 5Dmk2 has evident noise problems in darker areas.
  • The Aptus and D7000 are relatively close. The D7000 has an advantage due to lower noise from the chip, while the Aptus has an advantage due to higher full well capacity and thus lower photon shot noise.
  • Both the Aptus and the D7000 have random noise (except very close to the noise floor), while the 5Dmk2 has evident pattern noise, which makes the noise more disturbing and reduces subjective dynamic range further.
  • The Aptus is very slightly better than the D7000 down to about 7 stops from saturation (max 1/3 stop better, if even that), then read/dark noise becomes a factor and the D7000 wins by about 2/3 stop towards the noise floor.
  • If scaled for resolution the Aptus performs better relative to the D7000 but is still not as good in the darkest shadows.
  • The 5Dmk2 does not perform that badly in the brighter areas, probably even better than the D7000 close to saturation, but from about 6-7 stops from saturation noise becomes a real problem, especially due to the pattern. In the darkest shadows its noise is roughly 2 stops higher than the D7000's.
  • If an image is not brightened in post-processing, both the Aptus and the D7000 have controlled noise down to almost pitch black, while the 5Dmk2 noise becomes uncontrolled just before that, which makes noise visible in 100% crops on a calibrated screen but rarely in a print.
  • For the use case of brightening shadows in an underexposed backlit scene the D7000 and Aptus have a significant advantage over the 5Dmk2, because in the common case it can be done with acceptable results with the D7000/Aptus but not with the 5Dmk2. The D7000 is a little better than the Aptus in this use case but not significantly so.

Discussion

The Aptus Dalsa sensor is 48x36mm in size, while the D7000 Sony Exmor sensor is 24x16mm, that is only 22% of the Aptus sensor's area. Despite that, noise performance in the top 7 stops is very similar, and towards the noise floor the D7000 clearly exceeds the Aptus - that is truly impressive! On the other hand, the Aptus is 5 years older than the D7000 and shows that even old medium format digital backs have very good noise performance. As discussed in the introduction, the D7000 100% crops perform almost exactly the same as crops from the new full-frame D800.

That the "old" Leaf Aptus 75 fares so well is not a surprising result. The pace of CCD development has been slower, and the same sensor as used in the Aptus 75 is still used in current backs today. This cannot be said about the CMOS sensors used in DSLRs; back when the Aptus 75 was launched, CMOS noise performance at base ISO was not that good. This has changed, though, with the recent Sony Exmor sensors.

The Canon sensors are nowadays well-known for not having that great noise performance at base ISO, and we can clearly see that. The 5D mark 2 is far behind the other two. It is no surprise that past "MF vs DSLR" tests when Canon cameras were used showed a huge dynamic range advantage of the medium format sensors.

What do these noise levels mean in practice?

So how low noise levels do you actually need? Assuming you expose well it will depend on what you do in post-processing. If you do not push shadows, but rather apply a standard contrast-increasing S-curve which will darken shadows further then 5D mark 2 is adequate. So for many use cases one can argue that you don't really need more range than the 5D mark 2 provides.

In the backlit scene scenario, using the Aptus or D7000 may however provide a significant improvement over the 5D mark 2 - if you needed multiple exposures or gradient filters before, you can possibly drop them with the other two cameras, the D7000 having the greatest advantage. However, noise is not completely invisible and exposures out in the field are not always perfectly ETTR, so if you are a perfectionist you will probably want to use gradient filters or multiple exposures with all these cameras.

To truly make gradient filters obsolete I would say that you need the noise performance of the D7000 pixels plus two more stops up to saturation, so you can push with less photon shot noise.

So where is the medium format advantage?

In these tests I cannot say that medium format shows any particularly strong advantage over the 4.5 times smaller APS-C sensor, except resolution of course. When compared to the best DSLR sensors, the "superior dynamic range" claim is no longer true when looking at 100% crops. Clearly a D800, which has D7000 pixels and the resolution of the Aptus 75, would be the stronger overall performer in terms of noise.

I did observe a lower noise level of the Aptus at better lit areas, thanks to lower photon shot noise. This should lead to finer tonal transitions which also is often claimed, but the difference in this case would be very small and hard to detect.

Still, the claims of superior dynamic range have not disappeared. A quite common claim is that medium format has better ability to recover highlights, "more highlight dynamic range". This is probably due to differently tuned auto-exposure in the cameras and differently tuned histogram tools, that is, DSLRs tend to expose brighter (closer to clipping) than MF systems. But when you expose manually and learn how the histograms work this difference disappears.

The most likely advantage to actually exist and be significant (except resolution!) is color rendition, but this test doesn't look into that. The theory goes that the MF CCDs have color filters tuned for best color at the expense of high ISO performance (which will be bad anyway due to the CCD), while the DSLR CMOSes have filters better suited to keep good sensitivity at high ISO. I do not know if this is true and if it significantly affects performance, but I have seen side-by-side tests where skin color was clearly more accurate out-of-the-box for medium format. It is an interesting aspect that could be investigated further.

It is also clear that when you work with short depth of field the lenses used on medium format systems will produce a different look than DSLR lenses, and that look may be important to you. The effect is probably strongest at mid-range depths of field (short but not the shortest possible) and maybe this is what produces the enhanced "3D look" that some claim to see. Just like the color rendition aspect, this is something that needs further investigation.

Exposing in practice

Fine-tuning the exposure to just under clipping through raw diagnostics as done in this test is of course not practical in the field. So how far from optimal will these cameras expose if you use the tools at hand?

For manual still-life and landscape photography the "highlight blinkies" are the typical tool used, apart from the histogram. Overexpose slightly, then re-adjust until the highlights stop blinking. The most exact implementation would only start to blink if any raw channel gets clipped. However, the Canon and Nikon cameras analyze the embedded JPEG preview. The Leaf looks at the raw data, but checks only luminance.

In a high contrast scene with daylight and white highlights the blinking starts about 1/3 stop before optimal ETTR for all the tested cameras. This is good. In low contrast scenes the Canon/Nikon may push the JPEG and thus cause underexposure, but that does not matter much since it is a low contrast scene.

The worst case is with colored highlights, which through highlight reconstruction may be darkened to fit the sRGB or AdobeRGB color space and may not blink at all in the JPEG preview despite clipped raw data. This can happen on the Canon; unfortunately I did not have time to test it on the Nikon. Clipping may be seen in one of the individual RGB histograms though. To reduce these errors it is better to set the camera to the larger AdobeRGB color space (closer to what the camera can capture in raw), which is then used in the JPEG previews.

The Leaf Aptus raw-based histogram is much more reliable, but unfortunately the blinkies are only luminance-based. A technique that can be used on this back is to start with a slight underexposure, point at the highlight in the preview to get a readout of exposure, and adjust the shutter speed to something that puts it just under clipping. It should be noted that technical cameras with mechanical Copal shutters only have full stop shutter speeds; if one wants to adjust in 1/3 stops the aperture must be used.
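For the aperture adjustment mentioned above, the relation between f-number and stops can be sketched as follows. The light reaching the sensor scales with the inverse square of the f-number, so one stop corresponds to a factor of the square root of two; the function name and the example values are mine and only illustrate the arithmetic.

```python
def adjust_f_number(f_number, stops_less_light):
    """Return the f-number that passes `stops_less_light` stops less light.

    Light scales with 1 / f_number**2, so each stop changes the f-number
    by a factor of sqrt(2). Use a negative value to open up the aperture.
    """
    return f_number * 2.0 ** (stops_less_light / 2.0)


# Closing down 1/3 stop from f/8 lands close to the marked f/9 setting.
print(f"f/{adjust_f_number(8.0, 1 / 3):.2f}")
# Opening up 1/3 stop from f/8 lands close to f/7.1.
print(f"f/{adjust_f_number(8.0, -1 / 3):.2f}")
```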

Long exposures

The noise properties change greatly when exposing for tens of seconds. This is not tested here, but perhaps in a future test. Worth noting is that the Aptus is limited to 30 second exposures, since the sensor has too much dark noise (thermal noise) to allow longer ones. Results at 30 seconds should generally be good though, if hot pixels are cleaned.

Appendix

DxOMark dynamic range vs photographic useful dynamic range

The dynamic range measurement on DxOMark is a straightforward, traditional scientific signal-to-noise measurement which shows how many stops down from saturation one can go before the noise reaches the same level as the signal (S/N ratio = 1). On modern camera sensors this is somewhere between 11 and 14 stops. Many thus believe that modern cameras have as much as 14 stops of useful dynamic range. However, the stops get noisier and noisier the farther from saturation they are, down to the last stop where the S/N ratio is 1, and there the image of course looks absolutely horrible if it is brightened. So this measurement does not really say how many stops are useful to the photographer (which I have tried to answer to some extent in this test).
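The S/N = 1 definition can be made concrete with the same simplified pixel model of shot noise plus read noise. This is a sketch of the principle only, not DxOMark's actual procedure, and the full well and read noise values are hypothetical.

```python
import math


def signal_at_unity_snr(read_noise_electrons):
    """Signal (in electrons) at which S/N equals 1.

    Solves S / sqrt(S + r**2) = 1, i.e. S**2 - S - r**2 = 0, for S.
    """
    return (1.0 + math.sqrt(1.0 + 4.0 * read_noise_electrons ** 2)) / 2.0


def dynamic_range_stops(full_well_electrons, read_noise_electrons):
    """Stops between saturation and the signal level where S/N = 1."""
    floor = signal_at_unity_snr(read_noise_electrons)
    return math.log2(full_well_electrons / floor)


# Hypothetical pixel: 40000 electron full well, 3 electron read noise.
print(f"{dynamic_range_stops(40000.0, 3.0):.1f} stops down to S/N = 1")
```

The number says where the signal drowns completely, not where it stops looking good, so the photographically useful range ends several stops above this floor.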

Photographically useful dynamic range is subjective and considerably smaller than the scientifically measured range. It is not as bad as one may think though, because the noise appears where the signal is weak, that is in the shadows, and since shadows are dark, noise is more acceptable there. How much noise can be accepted will also depend heavily on the actual scene: noise in a large single-color patch will be more disturbing than in a section of a distant forest with lots of branches and other "random" small details.

The nature of the noise is also important, some cameras produce distinctive patterns in the noise (like the tested Canon 5D mark 2) while others have almost completely random noise. Random noise is much less disturbing to the eye.

As an example we can see how the Aptus 75 (75s on DxOMark), D7000 and 5D mark 2 compare pixel per pixel in the DxOMark dynamic range diagram (set to "Screen" mode to get pixel comparison). Nikon D7000 has 13.35 stops, Aptus 75s 11.43 and 5D mark 2 11.16 stops. Rounding a bit one could say that per pixel D7000 is 2 stops better than Aptus 75s which is 1/3 stop better than 5D mark 2. These are high quality scientific measurements of S/N ratio so there is no doubt that they are correct.

However, when S/N approaches 1 at the bottom stops the noise level is so high that it is hard for the eye to distinguish (or care about) the differences. Pattern noise also affects the subjective impact. The result of the visual test made here is that the Nikon D7000 is 2/3 stop better in dark shadows (considerably less than 2 stops!) and the 5D mark 2 is about 1 1/3 stops worse than the Aptus 75 (considerably more than 1/3 stop). This shows that an S/N=1 dynamic range measurement is not a good indication of actual performance in practice.

DxOMark also presents a measurement of S/N ratio for 18% gray, that is about 3 stops down from saturation. In the test done here this is the reference middle light level in the ETTR exposure, and we can conclude that cameras of any decent quality are impossible to tell apart there. Indeed DxOMark also shows only very small differences, generally an advantage to cameras with better full well capacity (less shot noise). Only when throwing in a small compact camera (fingernail-size sensor) can one see a significant difference at this light level.

When looking only at measurements for one camera (not in comparison mode), DxOMark provides "Full SNR", a complete graph of signal to noise ratio at all gray levels. This way you can see how noisy a sensor is at any light level you choose. Since this is not available in comparison mode, few DxOMark users use it or even know about it though.

At -8 stops (0.39%) the Aptus 75s gets 18.9 dB, the D7000 20.8 dB and the 5Dmk2 18 dB, that is about 2/3 stop advantage for the D7000 over the Aptus and a further 1/3 stop down to the 5Dmk2. This is similar to what I see here visually, but since an SNR graph does not take pattern noise into account the 5Dmk2 gets a full stop closer than in my test.
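The conversion from a dB difference to stops depends on which noise source dominates, and the conversion factor is not stated above. The quoted figures are consistent with roughly 3 dB per stop, which is what shot-noise-limited scaling gives (S/N grows with the square root of exposure); that choice is an assumption in the sketch below. In the read-noise-limited regime S/N grows linearly with exposure, giving about 6 dB per stop and half as large a difference in stops.

```python
import math

# One stop doubles the exposure. With shot noise dominating, S/N grows by
# sqrt(2) per stop; with read noise dominating, by 2 per stop.
DB_PER_STOP_SHOT_LIMITED = 20 * math.log10(math.sqrt(2))  # about 3.01 dB
DB_PER_STOP_READ_LIMITED = 20 * math.log10(2)  # about 6.02 dB


def snr_gap_in_stops(snr_db_a, snr_db_b, db_per_stop=DB_PER_STOP_SHOT_LIMITED):
    """Exposure difference in stops equivalent to an S/N difference in dB."""
    return (snr_db_a - snr_db_b) / db_per_stop


print(f"-8 stops is {100 * 2 ** -8:.2f}% of saturation")
print(f"D7000 vs Aptus 75s: {snr_gap_in_stops(20.8, 18.9):.2f} stops")
print(f"Aptus 75s vs 5Dmk2: {snr_gap_in_stops(18.9, 18.0):.2f} stops")
```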

To sum up, I would say that DxOMark does not provide any really good measurement of practical dynamic range when in comparison mode: the dynamic range diagram digs too deep into the noise, and the SNR 18% measurement is at too bright a light level to show any significant differences. The full SNR graph available when viewing a single camera provides more information, but must also be used with care since it will not take patterned noise, blotchiness or other noise quality aspects into account.

Common mistakes in DR testing

Dynamic range tests of MF vs DSLR similar to my test here have been done several times before, but often contain errors. These are common ones:

  • In-camera histograms, raw converter histograms, auto-exposure or similar are used to find the optimal ETTR exposure rather than proper raw diagnostics. The result is slightly under- or overexposed files, which may put one system at an unfair advantage.
  • Raw converter exposure sliders are thought to work linearly, that is +1 stop on the slider means +1 stop in the file, but very few raw converters are linear due to highlight reconstruction, default contrast increase and other features. The result is wrong estimations of available dynamic range.

All current sensors record light linearly up to saturation, that is there is no built-in highlight and shadow compression like in, for example, slide film. Some users testing dynamic range with Lightroom (and many other raw converters) by moving the exposure slider start to believe that the sensors behave non-linearly, but it is the software that is non-linear - do not trust the exposure slider in raw converters unless you know exactly how it is implemented. Usually the highlight reconstruction/compression code introduces the non-linear behavior.
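One way to check a converter's exposure slider is to measure the mean value of the same patch before and after a push and see how many stops the output actually moved. The sketch below assumes the values are linear (gamma curve disabled or removed) and that the patch stays well below clipping; the function names and the 0.1 stop tolerance are arbitrary choices for the example.

```python
import math


def effective_push_stops(mean_before, mean_after):
    """Stops actually applied, judged from linear patch means."""
    return math.log2(mean_after / mean_before)


def slider_is_linear(mean_before, mean_after, slider_stops, tolerance=0.1):
    """True if the measured push matches the slider setting within tolerance."""
    measured = effective_push_stops(mean_before, mean_after)
    return abs(measured - slider_stops) <= tolerance


# A +2 stop slider setting should quadruple a linear patch value.
print(slider_is_linear(100.0, 400.0, 2.0))  # a linear converter
print(slider_is_linear(100.0, 300.0, 2.0))  # a converter compressing the push
```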

If the scene contains the sun or another very strong highlight one has to clip some, but to simplify let us assume we have a normally bright highlight which we do not need to clip, as in the test scene. To expose optimally we want to gather as much light as possible and thus put that highlight as close to clipping as possible - "expose to the right", that is the histogram should be as far right as possible without clipping.

The problem is that almost all cameras have slightly misleading histograms. Canon and Nikon cameras show the histograms for the embedded JPEG preview, not the raw data. The Aptus used in this test has raw histograms (as far as I know) but shows highlight clipping before actual clipping occurs. In practical use the histograms are generally ok, but for exact exposure for testing one will have to analyze the raw file with proper software and find the optimal exposure through trial and error.

Highlight clipping caveats

Most sensors have three color channels - red, green and blue. Half of the pixels are green, 1/4 red and 1/4 blue, and to form the final picture which needs complete RGB values for each pixel the colors are interpolated through a demosaicing algorithm.
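The channel proportions follow directly from the repeating 2x2 color filter tile. The sketch below assumes the common RGGB layout (the tile order differs between cameras, but the proportions are the same) and simply counts the filter colors over a small sensor area.

```python
from collections import Counter


def bayer_color_at(row, col, pattern=("RG", "GB")):
    """Color filter at a pixel position for a repeating 2x2 Bayer tile."""
    return pattern[row % 2][col % 2]


def channel_fractions(height, width, pattern=("RG", "GB")):
    """Fraction of pixels behind each filter color over a sensor area."""
    counts = Counter(
        bayer_color_at(row, col, pattern)
        for row in range(height)
        for col in range(width)
    )
    total = height * width
    return {color: count / total for color, count in counts.items()}


print(channel_fractions(4, 6))
```

Each pixel records only one of the three colors, so the two missing values at every position are what the demosaicing algorithm has to interpolate from the neighbors.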

Typically the green pixels are the most sensitive and clip first when a highlight is overexposed. Since one channel at a time is clipped, discoloration can occur in clipped areas, which raw converter highlight reconstruction algorithms try to hide, and succeed in doing in most cases.

Another aspect is that to get proper white balance the RGB channels are re-balanced, which means that clipping can be introduced by the raw converter in a near-saturated highlight even though it is not clipped in the raw file.
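How white balancing can introduce clipping is easy to show with numbers. The sketch below scales each raw channel by a white balance multiplier and clips at the output maximum; the multipliers and raw values are hypothetical, and real converters may normalize the multipliers differently.

```python
def apply_white_balance(raw_rgb, multipliers, white_level=1.0):
    """Scale raw channels by white balance multipliers and clip.

    Returns the clipped values plus a flag per channel telling whether
    the scaling pushed that channel past the white level.
    """
    balanced = [value * gain for value, gain in zip(raw_rgb, multipliers)]
    clipped = [min(value, white_level) for value in balanced]
    introduced_clipping = [value > white_level for value in balanced]
    return clipped, introduced_clipping


# Hypothetical highlight: no channel is clipped in the raw data (all < 1.0),
# but the red multiplier pushes red past the white level.
raw_highlight = (0.60, 0.95, 0.55)
daylight_multipliers = (2.0, 1.0, 1.5)
print(apply_white_balance(raw_highlight, daylight_multipliers))
```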

Due to demosaicing, highlight reconstruction and white balancing it is in most software hard to actually see if a highlight is clipped or not if only one channel is clipped in the raw file.

Raw conversion for diagnostics

Available for free and on many platforms is the dcraw raw converter. This can be used to make a no-fuss raw conversion to find out if an area clips or not.

    dcraw -v -r 1 1 1 1 -o 0 -H 0 -T -W -g 1 1 file

This will give you an 8 bit TIFF file which is demosaiced but has no color balancing, no gamma curve and no highlight reconstruction. Clipped channels will be at 255, or for some camera models at a lower value, for example 246 - dcraw unfortunately does not have the correct clipping level for all models. To find the clipping value for your camera, shoot a frame where an area is clipped for certain and check which value that area gets in the converted file. The files will look greenish (no white balancing) and dark (no gamma curve).
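Once the TIFF is loaded into memory, checking for clipped channels is a simple count. The sketch below works on a plain list of 8-bit (r, g, b) tuples; reading the TIFF into such a list is left out and would need an imaging library of your choice. The function name is hypothetical, and the clip level should be set to whatever value your camera model actually clips at in dcraw's output.

```python
def clipped_fraction_per_channel(pixels, clip_level=255):
    """Fraction of pixels at or above clip_level, per channel.

    `pixels` is an iterable of (r, g, b) tuples with 8-bit values.
    """
    counts = [0, 0, 0]
    total = 0
    for pixel in pixels:
        total += 1
        for channel, value in enumerate(pixel):
            if value >= clip_level:
                counts[channel] += 1
    if total == 0:
        raise ValueError("no pixels given")
    return tuple(count / total for count in counts)


# Tiny made-up crop: green clips in two of four pixels, red in one.
crop = [(255, 255, 10), (100, 255, 10), (0, 0, 0), (1, 2, 3)]
print(clipped_fraction_per_channel(crop))
```

Running this on a crop around the brightest highlight shows directly which channel clips first, without any highlight reconstruction hiding it.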

To also skip demosaicing you can run dcraw in "document" mode, like this:

    dcraw -v -d -o 0 -H 0 -T -W -g 1 1 file

Another useful piece of software for diagnostics is RawTherapee, which is a full-featured raw converter like Adobe Lightroom, but unlike in most raw converters, highlight reconstruction etc. can be turned off so that a completely linear behavior of the exposure slider can be achieved. There is also good control of color space to deal with color space clipping.



(c) Copyright 2012 - Anders Torger.