I love posts that peel back the abstraction layer of "images." It really highlights that modern photography is just signal processing with better marketing.
A fun tangent on the "green cast" mentioned in the post: the reason the Bayer pattern is RGGB (50% green) isn't just about color balance, but spatial resolution. The human eye is most sensitive to green light, so that channel effectively carries the majority of the luminance (brightness/detail) data.
In many advanced demosaicing algorithms, the pipeline actually reconstructs the green channel first to get a high-resolution luminance map, and then interpolates the red/blue signals—which act more like "color difference" layers—on top of it. We can get away with this because the human visual system is much more forgiving of low-resolution color data than it is of low-resolution brightness data. It’s the same psycho-visual principle that justifies 4:2:0 chroma subsampling in video compression.
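To make the 4:2:0 point concrete, here is a minimal sketch of chroma subsampling in plain Python. It assumes full-range BT.601 conversion constants and simple 2x2 averaging; real encoders differ in filter choice and chroma siting, and the function names are made up for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> (Y, Cb, Cr), all values as floats in 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 + 0.564 * (b - y)  # 0.564 ~= 0.5 / (1 - 0.114)
    cr = 128.0 + 0.713 * (r - y)  # 0.713 ~= 0.5 / (1 - 0.299)
    return y, cb, cr


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (height and width must be even)."""
    h, w = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def sample_counts(height, width):
    """Samples stored per frame: full-resolution 4:4:4 versus 4:2:0."""
    full = 3 * height * width
    sub = height * width + 2 * (height // 2) * (width // 2)
    return full, sub
```

The luma plane keeps every sample while each chroma plane keeps one sample per 2x2 block, so `sample_counts(4, 4)` gives 48 samples for 4:4:4 against 24 for 4:2:0.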
Also, for anyone interested in how deep the rabbit hole goes, looking at the source code for dcraw (or libraw) is a rite of passage. It’s impressive how many edge cases exist just to interpret the "raw" voltages from different sensor manufacturers.
> A fun tangent on the "green cast" mentioned in the post: the reason the Bayer pattern is RGGB (50% green) isn't just about color balance, but spatial resolution. The human eye is most sensitive to green light, so that channel effectively carries the majority of the luminance (brightness/detail) data.
From the man page for ppmtopgm, which converts the classic "ppm" (portable pixmap) format to "pgm" (portable graymap):
The quantization formula ppmtopgm uses is g = .299 r + .587 g + .114 b.
You'll note the relatively high value of green there, making up nearly 60% of the luminosity of the resulting grayscale image.
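For anyone who wants to poke at those weights, here is a rough sketch of the same conversion on the plain-text variants of the formats (P3 in, P2 out). It is not the netpbm source: it assumes no comment lines and no binary P6 input, and `ppm_to_pgm` is a made-up helper name.

```python
def ppm_to_pgm(ppm_text):
    """Convert a plain (P3) PPM string to a plain (P2) PGM string.

    Simplified sketch: no comment lines, no binary P6 support.
    """
    tokens = ppm_text.split()
    if tokens[0] != "P3":
        raise ValueError("expected a plain P3 PPM")
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    samples = [int(t) for t in tokens[4:4 + 3 * width * height]]

    grays = []
    for i in range(0, len(samples), 3):
        r, g, b = samples[i:i + 3]
        # Weights quoted from the man page; + 0.5 rounds to nearest integer.
        grays.append(int(0.299 * r + 0.587 * g + 0.114 * b + 0.5))

    rows = [
        " ".join(str(v) for v in grays[y * width:(y + 1) * width])
        for y in range(height)
    ]
    return "P2\n{} {}\n{}\n{}\n".format(width, height, maxval, "\n".join(rows))
```

Feeding it one pure red, one pure green, and one pure blue pixel at maxval 255 gives grays of 76, 150, and 29, which shows how much of the result comes from green.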
I also love the quote in there...
Quote
Cold-hearted orb that rules the night
Removes the colors from our sight
Red is gray, and yellow white
But we decide which is right
And which is a quantization error.
Funnily enough that's not the only mistake he made in that article. His final image is noticeably different from the camera's output image because he rescaled the values in the first step. That's why the dark areas look so crushed, e.g. around the firewood carrier on the lower left or around the cat, and similarly with highlights, e.g. the specular highlights on the ornaments.
After that, the next most important problem is the fact he operates in the wrong color space, where he's boosting raw RGB channels rather than luminance. That means that some objects appear much too saturated.
So his photo isn't "unprocessed", it's just incorrectly processed.
When I worked at Amazon on the Kindle Special Offers team (ads on your eink Kindle while it was sleeping), the first implementation of auto-generated ads was by someone who didn't know that properly converting RGB to grayscale was a smidge more complicated than just averaging the RGB channels. So for ~6 months in 2015ish, you may have seen a bunch of ads that looked pretty rough. I think I just needed to add a flag to the FFmpeg call to get it to convert RGB to luminance before mapping it to the 4-bit grayscale needed.
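A quick sketch of why plain averaging looks rough on a 16-level display. This is not the actual Kindle pipeline or any FFmpeg option: the luma weights are the common Rec. 601 ones, the 4-bit mapping is a plain linear quantizer, gamma is ignored, and all names are hypothetical.

```python
def gray_average(r, g, b):
    """Naive grayscale: unweighted mean of the three channels."""
    return (r + g + b) / 3.0


def gray_luma(r, g, b):
    """Weighted grayscale using the Rec. 601 luma coefficients."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_4bit(gray):
    """Map a 0..255 gray value onto 16 levels (0..15)."""
    return min(15, int(gray * 16 / 256))


green = (0, 255, 0)
blue = (0, 0, 255)

# Averaging collapses both colors to level 5.
avg_levels = (to_4bit(gray_average(*green)), to_4bit(gray_average(*blue)))

# Luma keeps them far apart: level 9 for green, level 1 for blue.
luma_levels = (to_4bit(gray_luma(*green)), to_4bit(gray_luma(*blue)))
```

With only 16 levels to spend, throwing away the brightness difference between saturated colors is very visible, which fits the "pretty rough" description.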
I don't think Kindle ads were available in my region in 2015 because I don't remember seeing these back then, but you're a lucky one to fix this classic mistake :-)
I remember trying out some of the home-made methods while I was implementing a creative work section for a school assignment. It’s surprising how "flat" the basic average looks until you actually respect the coefficients (usually some flavor of 0.21R + 0.72G + 0.07B). I bet it's even more apparent in a 4-bit display.
I remember using some photo editing software (Aperture I think) that would allow you to customize the different coefficients and there were even presets that give different names to different coefficients. Ultimately you can pick any coefficients you want, and only your eyes can judge how nice they are.
>Ultimately you can pick any coefficients you want, and only your eyes can judge how nice they are.
I went to a Photoshop conference. There was a session on converting color to black and white. Basically at the end the presenter said you try a bunch of ways and pick the one that looks best.
(people there were really looking for the “one true way”)
I shot a lot of black and white film in college for our paper. One of my obsolete skills was thinking how an image would look in black and white while shooting, though I never understood the people who could look at a scene and decide to use a red filter..
Interesting that the "NTSC" look you describe is essentially rounded versions of the coefficients quoted in the comment mentioning ppm2pgm. I don't know the lineage of the values you used of course, but I found it interesting nonetheless. I imagine we'll never know, but it would be cool to be able to trace the path that led to their formula, as well as the path to you arriving at yours.
The NTSC color coefficients are the grandfather of all luminance coefficients.
It had to be precisely defined because of the requirements of backwards-compatible color transmission (YIQ is the common abbreviation for the NTSC color space, I being ~reddish and Q being ~blueish). Basically, they treated B&W (technically monochrome) pictures the way B&W film and video tubes did: great in green, average in red, and poor in blue.
A bit unrelated: pre-color transition, the makeup used was actually slightly greenish too (which shows up nicely in monochrome).
Cool. I could have been clearer in my post; as I understand it actual NTSC circuitry used different coefficients for RGBx and RGBy values, and I didn't take time to look up the official standard. My specific pondering was based on an assumption that neither the ppm2pgm formula nor the parent's "NTSC" formula were exact equivalents to NTSC, and my "ADHD" thoughts wondered about the provenance of how each poster came to use their respective approximations. While I write this, I realize that my actual ponderings are less interesting than the responses generated because of them, so thanks everyone for your insightful responses.
I was actually researching why PAL YUV has the same(-ish) coefficients, while forgetting that PAL is essentially a refinement of the NTSC color standard (PAL stands for phase-alternating line, which solves much of NTSC's color drift issues early in its life).
Exactly - film photographers heavily process(ed) their images from the film processing through to the print. Ansel Adams wrote a few books on the topic and they’re great reads.
And different films and photo papers can have totally different looks, defined by the chemistry of the manufacturer and however _they_ want things to look.
Excepting slide photos. No real adjustment once taken (a more difficult medium than negative film which you can adjust a little when printing)
You’re right about Ansel Adams. He “dodged and burned” extensively (lightened and darkened areas when printing.)
Photoshop kept the dodge and burn names on some tools for a while.
When we printed for our college paper we had a dial that could adjust the printed contrast of our black and white “multigrade” paper a bit (it added red light). People would mess with the processing to get different results too (cold/sepia toned). It was hard to get exactly what you wanted and I kind of see why digital took over.
the bayer pattern is one of those things that makes me irrationally angry, in the true sense, based on my ignorance of the subject
what's so special about green? oh so just because our eyes are more sensitive to green we should dedicate double the area to green in camera sensors? i mean, probably yes. but still. (⩺_⩹)
Green is in the center of the visible spectrum of light (notice the G in the middle of ROYGBIV), so evolution should theoretically optimize for green light absorption. An interesting article on why plants typically reflect that wavelength and absorb the others: https://en.wikipedia.org/wiki/Purple_Earth_hypothesis
Green is roughly where our sun's output peaks (at least when the spectrum is plotted by wavelength), which is why green appears in the middle of the visible spectrum. The visible spectrum basically exists because we "grew up" with a sun that blasts that frequency range more than any other part of the light spectrum.
I have to wonder what our planet would look like if the spectrum shifts over time. Would plants also shift their reflected light? Would eyes subtly change across species? Of course, there would probably be larger issues at play around having a survivable environment … but still, fun to ponder.
Several reasons,
-Silicon efficiency (QE) peaks in the green
-Green spectral response curve is close to the luminance curve humans see, like you said.
-Twice the pixels to increase the effective resolution in the green/luminance channel, color channels in YUV contribute almost no details.
Why are YUV or other luminance-chrominance color spaces important for an RGB input? Because many processing steps and encoders work in YUV color spaces. This wasn't really covered in the article.
Not sure why it would invoke such strong sentiments but if you don’t like the bayer filter, know that some true monochrome cameras don’t use it and make every sensor pixel available to the final image.
For instance, the Leica M series has specific monochrome versions with huge resolutions and better monochrome rendering.
You can also modify some cameras and remove the filter, but the results usually need processing.
A side effect is that the now exposed sensor is more sensitive to both ends of the spectrum.
This is also why I absolutely hate, hate, hate it when people ask me whether I "edited" a photo or whether a photo is "original", as if trying to explain away nice-looking images as fake.
The JPEGs cameras produce are heavily processed, and they are emphatically NOT "original". Taking manual control of that process to produce an alternative JPEG with different curves, mappings, calibrations, is not a crime.
As a mostly amateur photographer, it doesn't bother me if people ask that question. While I understand the point that the camera itself may be making some 'editing' type decision on the data first, a) in theory each camera maker has attempted to calibrate the output to some standard, b) the public would expect two photos taken at the same time with the same model of camera to look identical. That differs greatly from what often can happen in "post production" editing - you'll never find two that are identical.
I wrote the raw Bayer to JPEG pipeline used by the phone I write this comment on. The choices on how to interpret the data are mine. Can I tweak these afterwards? :)
I mean it depends, does your Bayer-to-JPEG pipeline try to detect things like 'this is a zoomed in picture of the moon' and then do auto-fixup to put a perfect moon image there? That's why there's some need to differentiate between SOOC's now, because Samsung did that.
I know my Sony gear can't call out to AI because the WIFI sucks like every other Sony product and barely works inside my house, but also I know the first ILC manufacturer that tries to put AI right into RAW files is probably the first to leave part of the photography market.
That said I'm a purist to the point where I always offer RAWs for my work [0] and don't do any photoshop/etc. D/A, horizon, bright adjust/crop to taste.
Where phones can possibly do better: the smaller size and true MP structure of a cell phone camera sensor make it easier to handle things like motion blur and rolling shutter.
But I have yet to see anything that gets closer to an ILC for true quality than the decade+ old PureView cameras on Nokia phones, probably partially because they often had sensors large enough.
There's only so much computation can do to simulate true physics.
[0] - I've found people -like- that. TBH, it helps that I tend to work cheap or for barter type jobs in that scene, however it winds up being something where I've gotten repeat work because they found me and a 'photoshop person' was cheaper than getting an AIO pro.
There's a difference between an unbiased (roughly speaking) pipeline and what (for example) JBIG2 did. The latter counts as "editing" and "fake" as far as I'm concerned. It may not be a crime but at least personally I think it's inherently dishonest to attempt to play such things off as "original".
And then there's all the nonsense BigTech enables out of the box today with automated AI touch ups. That definitely qualifies as fakery although the end result may be visually pleasing and some people might find it desirable.
it's not a crime but applying post processing in an overly generous way that goes a lot further than replicating what a human sees does take away from what makes pictures interesting imho vs other mediums, that it's a genuine representation of something that actually happened.
if you take that away, a picture is not very interesting, it's hyperrealistic so not super creative a lot of the time (compared to eg paintings), & it doesn't even require the mastery of other mediums to get hyperrealism
Perhaps interestingly, many/most digital cameras are sensitive to IR and can record, for example, the LEDs of an infrared TV remote.
But they don't see it as IR. Instead, this infrared information just kind of irrevocably leaks into the RGB channels that we do perceive. With the unmodified camera on my Samsung phone, IR shows up kind of purple-ish. Which is... well... it's fake. Making invisible IR into visible purple is an artificially-produced artifact of the process that results in me being able to see things that are normally ~impossible for me to observe with my eyeballs.
When you generate your own "genuine" images using your digital camera(s), do you use an external IR filter? Or are you satisfied with knowing that the results are fake?
But the camera is trying to emulate how it would look if your eyes were seeing it. In order for it to be 'genuine' you would need not only the camera to genuine, but also the OS, the video driver, the viewing app, the display and the image format/compression. They all do things to the image that are not genuine.
Upon inspection, the author's personal website used em dashes in 2023. I hope this helped with your witch hunt.
I'm imagining a sort of Logan's Run-like scifi setup where only people with a documented em dash before November 30, 2022, i.e. D(ash)-day, are left with permission to write.
> I'm imagining a sort of Logan's Run-like scifi setup where only people with a documented em dash before November 30, 2022, i.e. D(ash)-day, are left with permission to write.
At least Robespierre needed two sentences before condemning a man. Now the mob is lynching people on the basis of a single glyph.
Phew. I have published work with em dashes, bulleted lists, “not just X, but Y” phrasing, and the use of “certainly”, all from the 90’s. Feel sorry for the kids, but I got mine.
I have been overusing em dashes and bulleted lists since the actual 80s, I'm sad to say. I spent much of the 90s manually typing "smart" quotes.
I have actually been deliberately modifying my long-time writing style and use of punctuation to look less like an LLM. I'm not sure how I feel about this.
Alt + 0151, baby! Or... however you do it on MacOS.
But now, likewise, having to bail on emdashes. My last differentiator is that I always close set the emdash—no spaces on either side, whereas ChatGPT typically opens them (AP Style).
found the guy who didn't know about em dashes before this year
also your question implies a bad assumption even if you disclaim it. if you don't want to imply a bad assumption the way to do that is to not say the words, not disclaim them
But does applying the same transfer function to each pixel (of a given colour anyway) count as "processing"?
What bothers me as an old-school photographer is this. When you really pushed it with film (e.g. overprocess 400ISO B&W film to 1600 ISO and even then maybe underexpose at the enlargement step) you got nasty grain. But that was uniform "noise" all over the picture. Nowadays, noise reduction is impressive, but at the cost of sometimes changing the picture. For example, the IP cameras I have, sometimes when I come home on the bike, part of the wheel is missing, having been deleted by the algorithm as it struggled with the "grainy" asphalt driveway underneath.
Smartphone and dedicated digital still cameras aren't as drastic, but when zoomed in, or in low light, faces have a "painted" kind of look. I'd prefer honest noise, or better yet an adjustable denoising algorithm from "none" (grainy but honest) to what is now the default.
I hear you. Two years ago I went to my dad's and spent the afternoon "scanning" old pictures of my grandparents (his parents), who died almost two decades ago. I took pictures of the physical photos, holding the phone as horizontal as possible (parallel to the picture) so the result was as close to a scan as possible (to avoid perspective, reflection, etc).
It was my fault that I didn't check the pictures while I was doing it. Imagine my disappointment when I checked them back at home: the Android camera had decided to apply some kind of AI filter to all the pictures. Now my grandparents don't look like themselves at all, they are just an AI version.
> For example, the IP cameras I have, sometimes when I come home on the bike, part of the wheel is missing, having been deleted by the algorithm as it struggled with the "grainy" asphalt driveway underneath.
Heavy denoising is necessary for cheap IP cameras because they use cheap sensors paired with high f-number optics. Since you have a photography background you'll understand the tradeoff that you'd have to make if you could only choose one lens and f-stop combination but you needed everything in every scene to be in focus.
You can get low-light IP cameras or manual focus cameras that do better.
The second factor is the video compression ratio. The more noise you let through, the higher bitrate needed to stream and archive the footage. Let too much noise through for a bitrate setting and the video codec will be ditching the noise for you, or you'll be swimming in macroblocks. There are IP cameras that let you turn up the bitrate and decrease the denoise setting like you want, but be prepared to watch your video storage times decrease dramatically as most of your bits go to storing that noise.
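The storage side of that tradeoff is simple arithmetic. A back-of-the-envelope sketch, assuming a constant bitrate, continuous recording, and no filesystem overhead (the function name and the example numbers are made up):

```python
def retention_days(capacity_bytes, bitrate_bits_per_s):
    """Days of continuous footage that fit on a disk at a constant bitrate."""
    seconds = capacity_bytes * 8 / bitrate_bits_per_s
    return seconds / 86400


# Hypothetical numbers: a 1 TB disk at 2 Mbit/s versus 8 Mbit/s.
low_bitrate_days = retention_days(1e12, 2e6)
high_bitrate_days = retention_days(1e12, 8e6)
```

Quadrupling the bitrate to keep the noise cuts retention to a quarter, here from roughly 46 days to under 12.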
> Smartphone and dedicated digital still cameras aren't as drastic, but when zoomed in, or in low light, faces have a "painted" kind of look. I'd prefer honest noise, or better yet an adjustable denoising algorithm from "none" (grainy but honest) to what is now the default.
If you have an iPhone then getting a camera app like Halide and shooting in one of the RAW formats will let you do this and more. You can also choose Apple ProRAW on recent iPhone Pro models which is a little more processed, but still provides a large amount of raw image data to work with.
> does applying the same transfer function to each pixel (of a given colour anyway) count as “processing”?
This is interesting to think about, at least for us photo nerds. ;) I honestly think there are multiple right answers, but I have a specific one that I prefer. Applying the same transfer function to all pixels corresponds pretty tightly to film & paper exposure in analog photography. So one reasonable followup question is: did we count manually over- or under-exposing an analog photo to be manipulation or “processing”? Like you can’t see an image without exposing it, so even though there are timing & brightness recommendations for any given film or paper, generally speaking it’s not considered manipulation to expose it until it’s visible. Sometimes if we pushed or pulled to change the way something looks such that you see things that weren’t visible to the naked eye, then we call it manipulation, but generally people aren’t accused of “photoshopping” something just by raising or lowering the brightness a little, right?
When I started reading the article, my first thought was, ‘there’s no such thing as an unprocessed photo that you can see’. Sensor readings can’t be looked at without making choices about how to expose them, without choosing a mapping or transfer function. That’s not to mention that they come with physical response curves that the author went out of his way to sort-of remove. The first few dark images in there are a sort of unnatural way to view images, but in fact they are just as processed as the final image, they’re simply processed differently. You can’t avoid “processing” a digital image if you want to see it, right? Measuring light with sensors involves response curves, transcoding to an image format involves response curves, and displaying on monitor or paper involves response curves, so any image has been processed a bunch by the time we see it, right? Does that count as “processing”? Technically, I think exposure processing is always built-in, but that kinda means exposing an image is natural and not some type of manipulation that changes the image. Ultimately it depends on what we mean by “processing”.
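For what "the same transfer function for every pixel" looks like in practice, here is a small sketch using a gamma curve stored as a lookup table. The gamma value of 2.2 and the helper names are just assumptions for illustration; real cameras and raw converters use more elaborate curves.

```python
def make_gamma_lut(gamma, maxval=255):
    """Lookup table for out = maxval * (in / maxval) ** (1 / gamma)."""
    return [
        round(maxval * (v / maxval) ** (1.0 / gamma))
        for v in range(maxval + 1)
    ]


def apply_lut(pixels, lut):
    """Apply one global curve: each output depends only on that pixel's input."""
    return [lut[p] for p in pixels]


lut = make_gamma_lut(2.2)
brightened = apply_lut([0, 10, 10, 64, 255], lut)
```

Because the table is indexed only by the input value, two pixels with the same reading always come out the same no matter where they sit in the frame, which is the property that separates this kind of edit from local retouching.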
Equally bad is the massive over sharpening applied to CCTV and dash cams. I tried to buy a dash cam a year ago that didn't have over sharpened images but it proved impossible.
Reading reg plates would be a lot easier if I could sharpen the image myself rather than try to battle with the "turn it up to 11" approach by manufacturers.
Was mentioning to my GF (non technical animator) about the submission Clock synchronization is a nightmare. And how it comes up like a bad penny. She said in animation you have the problem that you're animating to match different streams and you have to keep in sync. Bonus you have to dither because if you match too close the players can smell it's off.
Artist develops a camera that takes AI-generated images based on your location.
Paragraphica generates pictures based on the weather, date, and other information.
You may know that intermittent rashes are always invisible in the presence of medical credentials.
Years ago I became suspicious of my Samsung Android device when I couldn't produce a reliable likeness of an allergy induced rash. No matter how I lit things, the photos were always "nicer" than what my eyes recorded live.
The incentives here are clear enough - people will prefer a phone whose camera gives them an impression of better skin, especially when the applied differences are extremely subtle and don't scream airbrush. If brand-x were the only one to allow "real skin" into the gallery viewer, people and photos would soon be decried as showing 'x-skin', which would be considered gross. Heaven help you if you ever managed to get close to a mirror or another human.
To this day I do not know whether it was my imagination or whether some inline processing effectively does or did perform micro airbrushing on things like this.
Whatever did or does happen, the incentive is evergreen - media capture must flatter the expectations of its authors, without getting caught in its sycophancy. All the while, capacity improves steadily.
I studied remote sensing in undergrad and it really helped me grok sensors and signal processing. My favourite mental model revelation to come from it was that what I see isn’t the “ground truth.” It’s a view of a subset of the data. My eyes, my cat’s eyes, my cameras all collect and render different subsets of the data, providing different views of the subject matter.
It gets even wilder when perceiving space and time as additional signal dimensions.
I imagine a sort of absolute reality that is the universe. And we’re all just sensor systems observing tiny bits of it in different and often overlapping ways.
> My favourite mental model revelation to come from it was that what I see isn’t the “ground truth.” It’s a view of a subset of the data. My eyes, my cat’s eyes, my cameras all collect and render different subsets of the data, providing different views of the subject matter.
And not only that, our sensors can return spurious data, or even purposely constructed fake data, created with good or evil intent.
I've had this in mind at times in recent years due to $DAYJOB. We use simulation heavily to provide fake CPUs, hardware devices, what have you, with the goal of keeping our target software happy by convincing it that it's running in its native environment instead of on a developer laptop.
Just keep in mind that it's important not to go _too_ far down the rabbit hole, one can spend way too much time in "what if we're all just brains in jars?"-land.
I think everyone agrees that dynamic range compression and de-Bayering (for sensors which are colour-filtered) are necessary for digital photography, but at the other end of the spectrum is "use AI to recognise objects and hallucinate what they 'should' look like" --- and despite how everyone would probably say that isn't a real photo anymore, it seems manufacturers are pushing strongly in that direction, raising issues with things like admissibility of evidence.
One thing I've learned while dabbling in photography is that there are no "fake" images, because there are no "real" images. Everything is an interpretation of the data that the camera has to do, making a thousand choices along the way, as this post beautifully demonstrates.
A better discriminator might be global edits vs local edits, with local edits being things like retouching specific parts of the image to make desired changes, and one could argue that local edits are "more fake" than global edits, but it still depends on a thousand factors, most importantly intent.
"Fake" images are images with intent to deceive. By that definition, even an image that came straight out of the camera can be "fake" if it's showing something other than what it's purported to (e.g. a real photo of police violence but with a label saying it's in a different country is a fake photo).
What most people think when they say "fake", though, is a photo that has had filters applied, which makes zero sense. As the post shows, all photos have filters applied. We should get over that specific editing process, it's no more fake than anything else.
> What most people think when they say "fake", though, is a photo that has had filters applied, which makes zero sense. As the post shows, all photos have filters applied.
Filters themselves don't make it fake, just like words themselves don't make something a lie. How the filters and words are used, whether they bring us closer or further from some truth, is what makes the difference.
Photos implicitly convey, usually, 'this is what you would see if you were there'. Obviously filters can help with that, as in the OP, or hurt.
Pretending that "these two things are the same, actually" when in fact no, you can separately name and describe them quite clearly, is a favorite pastime of vacuous content on the internet.
Artists, who use these tools with clear vision and intent to achieve specific goals, strangely never have this problem.
The ones that make the annual rounds up here in New England are those foliage photos with saturation jacked. “Look at how amazing it was!” They’re easy to spot since doing that usually wildly blows out the blues in the photo unless you know enough to selectively pull those back.
Often I find photos rather dull compared to what I recall. Unless the lighting is perfect it’s easy to end up with a poor image. On the other hand the images used in travel websites are laughably over processed.
Photography is also an art. When painters jack up saturations in their choices of paint colors people don't bat an eyelid. There's no good reason photographers cannot take that liberty as well, and tone mapping choices is in fact a big part of photographers' expressive medium.
If you want reality, go there in person and stop looking at photos. Viewing imagery is a fundamentally different type of experience.
Sure — but people reasonably distinguish between photos and digital art, with “photo” used to denote the intent to accurately convey rather than artistic expression.
We’ve had similar debates about art using miniatures and lens distortions versus photos since photography was invented — and digital editing fell on the lens trick and miniature side of the issue.
This is a longstanding debate in landscape photography communities - virtually everyone edits, but there’s real debate as to what the line is and what is too much. There does seem to be an idea of being faithful to the original experience, which I subscribe to, but that’s certainly not universal.
There are a whole lot of landscape photographs out there I can vouch for their realism 1% of the time because I do a lot of landscape photography myself and tend to get out at dawn and dusk a lot. There are lots of shots I got where the sky looked a certain way for a grand total of 2 minutes before sunrise, and I can see similar lighting in other peoples' shots as real.
A lot of armchair critics on the internet who only go out to their local park at high noon will say they look fake but they're not.
News agencies like AP have already come up with technical standards and guidelines to technically define 'acceptable' types and degrees of image processing applied to professional photo-journalism.
You can look it up because it's published on the web but IIRC it's generally what you'd expect. It's okay to do whole-image processing where all pixels have the same algorithm applied like the basic brightness, contrast, color, tint, gamma, levels, cropping, scaling, etc filters that have been standard for decades. The usual debayering and color space conversions are also fine. Selectively removing, adding or changing only some pixels or objects is generally not okay for journalistic purposes. Obviously, per-object AI enhancement of the type many mobile phones and social media apps apply by default don't meet such standards.
I think Samsung was doing what was alleged, but as somebody who was working on state of the art algorithms for camera processing at a competitor while this was happening, this experiment does not prove what is alleged. Gaussian blurring does not remove the information, you can deconvolve and it's possible that Samsung's pre-ML super resolution was essentially the same as inverting a gaussian convolution
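The claim that a Gaussian blur does not destroy information can be checked on a toy 1D signal. This sketch assumes a noise-free, circular convolution with a known kernel and uses a naive inverse filter; it says nothing about what any phone actually does, and with real sensor noise or 8-bit quantization the division below amplifies errors badly.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for tiny signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(
            x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
            for t in range(n)
        )
        out.append(s / n if inverse else s)
    return out


def gaussian_kernel(n, sigma):
    """Circular Gaussian kernel of length n, centred on index 0, summing to 1."""
    raw = [math.exp(-0.5 * (min(t, n - t) / sigma) ** 2) for t in range(n)]
    total = sum(raw)
    return [v / total for v in raw]


def blur(signal, kernel):
    """Circular convolution via the convolution theorem."""
    spec_s, spec_k = dft(signal), dft(kernel)
    product = [a * b for a, b in zip(spec_s, spec_k)]
    return [v.real for v in dft(product, inverse=True)]


def deblur(blurred, kernel):
    """Naive inverse filter: divide by the kernel's spectrum (noise-free only)."""
    spec_b, spec_k = dft(blurred), dft(kernel)
    quotient = [a / b for a, b in zip(spec_b, spec_k)]
    return [v.real for v in dft(quotient, inverse=True)]


signal = [0, 0, 0, 1, 1, 1, 0, 0, 5, 0, 0, 2, 2, 0, 0, 0]
kernel = gaussian_kernel(len(signal), sigma=1.0)
blurred = blur(signal, kernel)
recovered = deblur(blurred, kernel)
```

In this idealized setting `recovered` matches `signal` to within floating-point error even though `blurred` looks quite different, which is the point being made about the experiment: blurring alone is not proof that the detail is gone.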
And? What algorithm was used for downsampling? What was the high frequency content of the downsampled image after doing a pseudo-inverse with upsampling? How closely does it match the Samsung output?
My point is that there IS an experiment which would show that Samsung is doing some nonstandard processing likely involving replacement. The evidence provided is insufficient to show that
You can upscale a 170x170 image yourself, if you're not familiar with what that looks like. The only high frequency details you have after upscaling are artifacts. This thing pulled real details out of nowhere.
You can try to guess the location of edges to enhance them after upscaling, but it's guessing, and when the source has the detail level of a 170x170 moon photo a big proportion of the guessing will inevitably be wrong.
And in this case it would take a pretty amazing unblur to even get to the point it can start looking for those edges.
I think if you paste our conversation into ChatGPT it can explain the relevant upsampling algorithms. There are algorithms that will artificially enhance edges in a way that can look like "AI", for example everything done on Pixel phones prior to ~2023.
And to be clear, everyone including Apple has been doing this since at least 2017
The problem with what Samsung was doing is that it was moon-specific detection and replacement
You have clearly made no attempts to read the original article which has a lot more evidence (or are actively avoiding it), and somehow seem to be defending Samsung voraciously but emptily, so you're not worth arguing with and I'll just leave this here:
I zoomed in on the monitor showing that image and, guess what, again you see slapped on detail, even in the parts I explicitly clipped (made completely 100% white):
> somehow seem to be defending Samsung voraciously but emptily
The first words I said were that Samsung probably did this
And you're right that I didn't read the dozens of edits which were added after the original post. I was basing my arguments off everything before the "conclusion section", which it seems the author understands was not actually conclusive.
I agree that the later experiments, particularly the "two moons" experiment were decisive.
Also to be clear, I know that Samsung was doing this, because as I said I worked at a competitor. At the time I did my own tests on Samsung devices because I was also working on moon related image quality
Eh, I'm a photographer and I don't fully agree. Of course almost all photos these days are edited in some form. Intent is important, yes. But there are still some kinds of edits that immediately classify a photo as "fake" for me.
For example if you add snow to a shot with masking or generative AI. It's fake because the real life experience was not actually snowing. You can't just hallucinate a major part of the image - that counts as fake to me. A major departure from the reality of the scene. Many other types of edits don't have this property because they are mostly based on the reality of what occurred.
I think for me this comes from an intrinsic valuing of the act/craft of photography, in the physical sense. Once an image is too digitally manipulated then it's less photography and more digital art.
> A better discriminator might be global edits vs local edits,
Even that isn't all that clear-cut. Is noise removal a local edit? It only touches some pixels, but obviously, that's a silly take.
Is automated dust removal still global? The same idea, just a bit more selective. If we let it slide, what about automated skin blemish removal? Depth map + relighting, de-hazing, or fake bokeh? I think that modern image processing techniques really blur the distinction here because many edits that would previously need to be done selectively by hand are now a "global" filter that's a single keypress away.
Intent is the defining factor, as you note, but intent is... often hazy. If you dial down the exposure to make the photo more dramatic / more sinister, you're manipulating emotions too. Yet, that kind of editing is perfectly OK in photojournalism. Adding or removing elements for dramatic effect? Not so much.
What's this, special pleading for doctored photos?
The only process in the article that involves nearby pixels is to combine R G and B (and other G) into one screen pixel. (In principle these could be mapped to subpixels.) Everything fancier than that can be reasonably called some fake cosmetic bullshit.
The article doesn't even go anywhere near what you need to do in order to get an acceptable output. It only shows the absolute basics. If you apply only those to a photo from a phone camera, it will be massively distorted (the effect is smaller, but still present on big cameras).
That's just one kind of distortion you'll see. There will also be bad pixels, lens shading, excessive noise in low light, various electrical differences across rows and temperatures that need to be compensated... Some (most?) sensors will even correct some of these for you already before handing you "raw" data.
Raw formats usually carry "Bayer-filtered linear (well, almost linear) light in device-specific color space", not necessarily "raw unprocessed readings from the sensor array", although some vendors move it slightly more towards the latter than others.
But when you shift the goal posts that far, a real image has never been produced. But people very clearly want to describe when an image has been modified to represent something that didn’t happen.
I understand what you and the article are saying, but what GP is getting at, and what I agree with, is that there is a difference between a photo that attempts to reproduce what the "average" human sees, and digital processing that augments the image in ways that no human could possibly visualize. Sometimes we create "fake" images to improve clarity, detail, etc., but that's still less "fake" than smoothing skin to remove blemishes, or removing background objects. One is clearly a closer approximation of how we perceive reality than the other.
So there are levels of image processing, and it would be wrong to dump them all in the same category.
An unprocessed photo does not “look”. It is RGGB pixel values that far exceed any display media in dynamic range. Fitting it into the tiny dynamic range of screens by throwing away data strategically (inventing the perceptual neutral grey point, etc.) is what actually makes sense of them, and that is the creative task.
Correction is useful for a bunch of different reasons, not all of them related to monitors. Even ISP pipelines without displays involved will still usually do it to allocate more bits to the highlights/shadows than the relatively distinguishable middle bits. Old CRTs did it because the electron gun had a non-linear response and the gamma curve actually linearized the output. Film processing and logarithmic CMOS sensors do it because the sensing medium has a nonlinear sensitivity to the light level.
No. It's about the shape of the curve. Human light intensity perception is not linear. You have to nonlinearize at some point of the pipeline, but yes, typically you should use high-resolution (>=16 bits per channel) linear color in calculations and apply the gamma curve just before display. The fact that traditionally this was not done, and linear operations like blending were applied to nonlinear RGB values, resulted in ugly dark, muddy bands of intermediate colors even in high-end applications like Photoshop.
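A minimal sketch of the blending problem described above, assuming the standard sRGB transfer curve and 8-bit channels. The helper names are invented for the example.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c):
    """Encode linear light in [0, 1] with the sRGB transfer curve."""
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def blend_naive(a, b):
    """Average two 8-bit channel values directly on the encoded numbers."""
    return round((a + b) / 2)

def blend_linear(a, b):
    """Decode to linear light, average there, then re-encode."""
    mid = (srgb_to_linear(a / 255) + srgb_to_linear(b / 255)) / 2
    return round(linear_to_srgb(mid) * 255)

red, green = (255, 0, 0), (0, 255, 0)
naive = tuple(blend_naive(a, b) for a, b in zip(red, green))
linear = tuple(blend_linear(a, b) for a, b in zip(red, green))
print(naive, linear)
```

The naive midpoint lands around 128 per channel, while the linear-light midpoint re-encodes to roughly 188, so the naive blend is visibly darker. That is the muddy band between two saturated colours.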
The shape of the curve doesn't matter at all. What matters is having a mismatch between the capture curve and the display curve.
If you kept it linear all the way to the output pixels, it would look fine. You only have to go nonlinear because the screen expects nonlinear data. The screen expects this because it saves a few bits, which is nice but far from necessary.
To put it another way, it appears so dark because it isn't being "displayed directly". It's going directly out to the monitor, and the chip inside the monitor is distorting it.
>Human light intensity perception is not linear... You have to nonlinearize at some point of the pipeline
Why exactly? My understanding is that gamma correction is effectively an optimization scheme during encoding to allocate bits in a perceptually uniform way across the dynamic range. But if you just have enough bits to work with and are not concerned with file sizes (and assuming all hardware could support these higher bit depths), then this shouldn't matter? IIRC unlike CRTs, LCDs don't have a power curve response in terms of the hardware anyway, and emulate the overall 2.2 TRC via LUT. So you could certainly get monitors to accept linear input (assuming you manage to crank up the bit depth enough to the point where you're not losing perceptual fidelity), and just do everything in linear light.
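To put rough numbers on the bit-allocation argument, here is a small sketch that counts how many integer codes cover the darkest 1% of the linear range. It assumes the sRGB curve as the "gamma" encoding; `codes_below` is a made-up helper.

```python
def linear_to_srgb(c):
    """Encode linear light in [0, 1] with the sRGB transfer curve."""
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def codes_below(threshold, encode, bits=8):
    """Count integer codes (0 upward) that represent light <= threshold."""
    top = (1 << bits) - 1
    return int(encode(threshold) * top) + 1

shadow = 0.01  # darkest 1% of the linear range
linear_codes = codes_below(shadow, lambda c: c)
srgb_codes = codes_below(shadow, linear_to_srgb)
linear_codes_16 = codes_below(shadow, lambda c: c, bits=16)
print(linear_codes, srgb_codes, linear_codes_16)
```

At 8 bits, linear encoding leaves only a handful of codes for the shadows while the sRGB curve gives a couple of dozen. At 16 bits, linear already has hundreds, which is the "just use more bits" point.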
In fact if you just encoded the linear values as floats that would probably give you best of both worlds, since floating point is basically log-encoding where density of floats is lower at the higher end of the range.
If we're talking about a sunset, then we're talking about your monitor shooting out blinding, eye-hurting brightness light wherever the sun is in the image. That wouldn't be very pleasant.
That's a matter of tone mapping which is separate from gamma encoding? Even today, linearized pixel value 255 will be displayed at your defined SDR brightness no matter what. Changing your encoding gamma won't help that because for correct output the transform necessarily needs to be undone during display.
Which is why I'm looking at replacing my car's rear-view mirror with a camera and a monitor. Because I can hard-cap the monitor brightness and curve the brightness below that, eliminating the problem of billion-lumens headlights behind me.
Specifically Tim's quote "There's also this modern idea that art and technology must never meet - you know, you go to school for technology or you go to school for art, but never for both... And in the Golden Age, they were one and the same person."
Bob used to have some incredible articles on the science of photography that were linked from photo.net back when Philip Greenspun owned and operated it. A detailed explanation of digital sensor fundamentals (e.g. why bigger wells are inherently better) particularly sticks in my mind. They're still online (bookmarked now!)
I've always considered that Tim Jenison quote to be a reference to C.P. Snow's "The Two Cultures" lecture. Steve Jobs' ambition for Apple to be "where the Liberal Arts and Technology meet" also seemed similarly influenced. If you haven't read Snow's lecture, it's well worth the quick read.
Very interesting, pity the author chose such a poor example for the explanation (low, artificial and multicoloured light), making it really hard to understand what the "ground truth" and expected result should be.
I'm not sure I understand your complaint. The "expected result" is either of the last two images (depending on your preference), and one of the main points of the post is to challenge the notion of "ground truth" in the first place.
Not a complaint, but both the final images have poor contrast, lighting, saturation and colour balance, making them a disappointing target for an explanation of how these elements are produced from raw sensor data.
I work with camera sensors and I think this is a good way to train some of the new guys, with some added segments about the sensor itself and readout. It starts with raw data, something any engineer can understand, and the connection to the familiar output makes for good training.
This is a great write up. It's also weirdly similar to a video I happened upon yesterday playing around with raw Hubble imagery: https://www.youtube.com/watch?v=1gBXSQCWdSI
He takes a few minutes to get to the punch line. Feel free to skip ahead to around 5:30.
I spent a good part of my career, working in image processing.
That first image is pretty much exactly what a raw Bayer format looks like, without any color information. I find it gets even more interesting, if we add the RGB colors, and use non-square pixels.
In its most raw form, camera sensors only see illumination not color.
In front of the sensor is a bayer filter which results in each physical pixel seeing illumination filtered R G or B.
From there the software onboard the camera or in your RAW converter does interpolation to create RGB values at each pixel. For example if the local pixel is R filtered, it then interpolates its G & B values from nearby pixels of that filter.
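A minimal sketch of that interpolation step, assuming an RGGB layout and plain neighbour averaging in a 3x3 window. Real converters use far smarter, edge-aware methods; this is only the textbook baseline, and the function names are made up.

```python
def bayer_color(y, x):
    """Colour of the filter over photosite (y, x) in an RGGB layout."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"

def demosaic(mosaic):
    """Give every pixel R, G and B by averaging same-colour photosites in
    the surrounding 3x3 window. Assumes the mosaic is at least 2x2."""
    h, w = len(mosaic), len(mosaic[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    c = bayer_color(ny, nx)
                    sums[c] += mosaic[ny][nx]
                    counts[c] += 1
            pixel = {c: sums[c] / counts[c] for c in "RGB"}
            # Keep the measured value for the photosite's own colour.
            pixel[bayer_color(y, x)] = float(mosaic[y][x])
            row.append((pixel["R"], pixel["G"], pixel["B"]))
        out.append(row)
    return out

# Flat test scene: every red site reads 10, green 20, blue 30.
flat = [[{"R": 10, "G": 20, "B": 30}[bayer_color(y, x)] for x in range(4)]
        for y in range(4)]
rgb = demosaic(flat)
```

On a flat scene the interpolation is exact; the interesting failures (zippering, false colour) only show up at edges.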
This is also why Leica B&W sensor cameras have higher apparent sharpness & ISO sensitivity than the related color sensor models because there is no filter in front or software interpolation happening.
That's how the earliest color photography worked. "Making color separations by reloading the camera and changing the filter between exposures was inconvenient", notes Wikipedia.
I think they are both more asking about 'per pixel color filters'; that is, something like a sensor filter/glass but the color separators could change (at least 'per-line') fast enough to get a proper readout of the color information.
AKA imagine a camera with R/G/B filters being quickly rotated out for 3 exposures, then imagine it again but the technology is integrated right into the sensor (and, ideally, the sensor and switching mechanism is fast enough to read out with rolling shutter competitive with modern ILCs)
Works for static images, but if there's motion the "changing the filters" part is never fast enough, there will always be colour fringing somewhere.
Edit: or maybe it does work? I've watched at least one movie on a DLP type video projector with sequential colour and not noticed colour fringing. But still photos have much higher demand here.
You can use sets of exotic mirrors and/or prisms to split incoming images into separate RGB beams into three independent monochrome sensors, through the same singular lens and all at once. That's what "3CCD" cameras and their predecessors did.
The sensor outputs a single value per pixel. A later processing step is needed to interpret that data given knowledge about the color filter (usually Bayer pattern) in front of the sensor.
The raw sensor output is a single value per sensor pixel, each of which is behind a red, green, or blue color filter. So to get a usable image (where each pixel has a value for all three colors), we have to somehow condense the values from some number of these sensor pixels. This is the "Debayering" process.
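The crudest way to "condense" sensor pixels is to collapse each 2x2 RGGB cell into one RGB pixel at half resolution. A sketch, assuming an RGGB layout and even dimensions (the values in `mosaic` are invented):

```python
def bin_rggb(mosaic):
    """Collapse each 2x2 RGGB cell into one RGB pixel (half resolution)."""
    out = []
    for y in range(0, len(mosaic) - 1, 2):
        row = []
        for x in range(0, len(mosaic[0]) - 1, 2):
            r = mosaic[y][x]
            g = (mosaic[y][x + 1] + mosaic[y + 1][x]) / 2  # two greens per cell
            b = mosaic[y + 1][x + 1]
            row.append((r, g, b))
        out.append(row)
    return out

mosaic = [
    [100, 60, 102, 62],
    [58, 20, 60, 22],
    [98, 64, 96, 66],
    [62, 24, 64, 26],
]
half = bin_rggb(mosaic)
print(half[0][0])  # (100, 59.0, 20)
```

This throws away three quarters of the pixel count, which is why real pipelines interpolate instead of binning.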
And this is important because our perception is more sensitive to luminance changes than color, and with our eyes being most sensitive to green, luminance is also. So, higher perceived spatial resolution by using more green [1]. This is also why JPEG stores its chroma (roughly the red and blue information) at lower resolution, and why modern OLEDs usually use a PenTile layout, with only green being at full resolution [2].
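As a rough illustration of the JPEG side of this, here is a sketch of 4:2:0-style subsampling: convert to Y'CbCr with the JFIF coefficients, keep luma at full resolution, and average each chroma plane over 2x2 blocks. The test image is synthetic and the code assumes even dimensions.

```python
def rgb_to_ycbcr(r, g, b):
    """JFIF-style full-range conversion from 8-bit RGB to Y'CbCr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(image):
    """Keep luma at full resolution; average chroma over 2x2 blocks."""
    h, w = len(image), len(image[0])
    ycc = [[rgb_to_ycbcr(*px) for px in row] for row in image]
    luma = [[p[0] for p in row] for row in ycc]

    def pooled(ch):
        return [[sum(ycc[y + dy][x + dx][ch] for dy in (0, 1) for dx in (0, 1)) / 4
                 for x in range(0, w, 2)] for y in range(0, h, 2)]

    return luma, pooled(1), pooled(2)

image = [[(x * 60, y * 60, 128) for x in range(4)] for y in range(4)]
luma, cb, cr = subsample_420(image)
samples_full = 3 * 4 * 4
samples_420 = sum(len(r) for r in luma) + sum(len(r) for r in cb) + sum(len(r) for r in cr)
print(samples_full, samples_420)  # 48 24
```

Half the samples are gone before any actual compression happens, and green still dominates the full-resolution luma plane.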
Pentile displays are acceptable for photos and videos, but look really horrible displaying text and fine detail --- which looks almost like what you'd see on an old triad-shadow-mask colour CRT.
I love the look of the final product after the manual work (not the one for comparison). Just something very realistic and wholesome about it, not pumped to 10 via AI or Instagram filters.
> There’s nothing that happens when you adjust the contrast or white balance in editing software that the camera hasn’t done under the hood. The edited image isn’t “faker” then the original: they are different renditions of the same data.
Almost, but not quite? The camera works with more data than what's present in the JPG your image editing software sees.
For those who are curious, this is basically what we do when we color grade in video production but taken to its most extreme. Or rather, stripped down to the most fundamental level. Lots of ways to describe it.
Generally we shoot “flat” (there are so many caveats to this but I don’t feel like getting bogged down in all of it. If you plan on getting down and dirty with colors and really grading, you generally shoot flat). The image that we handover to DIT/editing can be borderline grayscale in its appearance. The colors are so muted, the dynamic range is so wide, that you basically have a highly muted image. The reason for this is you then have the freedom to “push” the color and look and almost any direction, versus if you have a very saturated, high contrast image, you are more “locked” into that look. This matters more and more when you are using a compressed codec and not something with an incredibly high bitrate or raw codecs, which is a whole other world and I am also doing a bit of a disservice to by oversimplifying.
Though this being HN it is incredibly likely I am telling few to no people anything new here lol
"Flat" is a bit of a misnomer in this context. It's not flat, it's actually a logarithmic ("log profile") representation of data computed by the camera to allow a wider dynamic range to be squeezed into traditional video formats.
It's sort of the opposite of what's going on with photography, where you have a dedicated "raw" format with linear readings from the sensor. Without these formats, someone would probably have invented "log JPEG" or something like that to preserve more data in highlights and in the shadows.
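To make the "log profile" idea concrete, here is a toy pure-log curve. It is not any vendor's actual profile (those add a linear toe, offsets, and so on); `STOPS`, `WHITE` and `BLACK` are assumptions picked for the example.

```python
import math

STOPS = 14                    # hypothetical scene range covered, in stops
WHITE = 1.0                   # linear value mapped to code 1.0
BLACK = WHITE / 2 ** STOPS    # linear value mapped to code 0.0

def log_encode(linear):
    """Map linear light in [BLACK, WHITE] to [0, 1], equal range per stop."""
    linear = min(max(linear, BLACK), WHITE)
    return math.log2(linear / BLACK) / STOPS

def log_decode(code):
    """Inverse of log_encode."""
    return BLACK * 2 ** (code * STOPS)

per_stop = log_encode(0.5) - log_encode(0.25)
print(per_stop)  # every stop gets the same 1/14 slice of the code range
```

A linear encoding would spend half of all codes on the single brightest stop; the log curve spreads them evenly, which is why the footage looks washed out until it is graded.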
I said “flat” because I didn’t feel like going into “log” and color profiles and such but I’ll admit I’m leaning hard into over-simplification, because log, raw, etc. gets messy when discussing profiles vs codecs/compression/etc. In video we still call some codecs “raw,” but it’s not the same necessarily as how it’s used in photography. Like the Red raw codec has various compression ratios (5:1 tends to be the sweet spot IME) and it really messes with the whole idea of what raw even is. It’s all quasi-technical and somewhat inconsistent.
Honestly, I think the gamma normalization step doesn't really count as "processing", any more than the gzip decompression step counts as "processing" for the purposes of a "this is what an unprocessed html file looks like" demo. At the end of the day, it's the same information, but encoded differently. Similar arguments can be made for the de-bayer filter step. If you ignore these two steps, the "processing" that happens looks far less dramatic.
I wanted to say that I think it's overrated in terms of its position on HN, but rather than criticize side issues of it, which often point to something being a weak article in general, I probably should have just said exactly what I don't like about it as a whole. So I'll do that.
I think the headline is problematic because it suggests the raw photos aren't very good and thus need processing, however the raw data isn't something the camera makers intend to be put forth as a photo, and the data is intended to be processed right from the start. The data of course can be presented as images but that serves as visualizations of the data rather than the source image or photo. Wikipedia does it a lot more justice. https://en.wikipedia.org/wiki/Raw_image_format If articles like OP's catch on, camera makers might be incentivized to game the sensors so their output makes more sense to the general public, and that would be inefficient, so the proper context should be given, which this "unprocessed photo" article doesn't do in my opinion.
There are also no citations, and it has this phrase "This website is not licensed for ML/LLM training or content creation." Yeah right, that's like the privacy notice posts people make to facebook from time to time that contradict the terms of service https://knowyourmeme.com/memes/facebook-privacy-notices
I've been studying machine learning during the xmas break, and as an exercise I started tinkering around with the raw Bayer data from my Nikon camera, throwing it at various architectures to see what I can squeeze out of the sensor.
Something that surprised me is that very little of the computational photography magic that has been developed for mobile phones has been applied to larger DSLRs. Perhaps it's because it's not as desperately needed, or because prior to the current AI madness nobody had sufficient GPU power lying around for such a purpose.
For example, it's a relatively straightforward exercise to feed in "dark" and "flat" frames as extra per-pixel embeddings, which lets the model learn about the specifics of each individual sensor and its associated amplifier. In principle, this could allow not only better denoising, but also stretch the dynamic range a tiny bit by leveraging the less sensitive photosites in highlights and the more sensitive ones in the dark areas.
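For context, this is the classical (non-ML) use of dark and flat frames that astrophotographers have done for ages, not the learned per-pixel embedding described above. A sketch on a synthetic 2x3 "sensor" with made-up offsets and sensitivities:

```python
def calibrate(raw, dark, flat):
    """Classic per-pixel correction: subtract the dark frame, then divide by
    the dark-subtracted flat frame normalised to a mean of 1."""
    gain = [[f - d for f, d in zip(frow, drow)] for frow, drow in zip(flat, dark)]
    mean_gain = sum(map(sum, gain)) / (len(gain) * len(gain[0]))
    return [[(r - d) / (g / mean_gain) for r, d, g in zip(rrow, drow, grow)]
            for rrow, drow, grow in zip(raw, dark, gain)]

# Synthetic sensor: each photosite has its own offset and sensitivity.
offset = [[10, 12, 11], [9, 10, 13]]
sens = [[1.0, 0.8, 0.5], [0.9, 1.0, 0.6]]

def expose(light):
    """Simulated readout for a uniformly lit scene of the given brightness."""
    return [[o + s * light for o, s in zip(orow, srow)]
            for orow, srow in zip(offset, sens)]

dark, flat, raw = expose(0), expose(1000), expose(200)
corrected = calibrate(raw, dark, flat)
```

After correction every photosite reports the same value for the uniformly lit scene, even though the raw readings differ by a factor of two.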
Similarly, few if any photo editing products do simultaneous debayering and denoising, most do the latter as a step in normal RGB space.
Not to mention multi-frame stacking that compensates for camera motion, etc...
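The reason stacking is worth the trouble: averaging N aligned frames cuts random noise by about sqrt(N). A sketch with simulated Gaussian noise. It assumes the frames are already perfectly registered, which is the hard part this example skips; all constants are arbitrary.

```python
import random
import statistics

random.seed(0)
TRUE_VALUE, SIGMA, PIXELS, FRAMES = 100.0, 10.0, 2000, 16

def noisy_frame():
    """One simulated exposure of a flat grey scene with Gaussian read noise."""
    return [random.gauss(TRUE_VALUE, SIGMA) for _ in range(PIXELS)]

frames = [noisy_frame() for _ in range(FRAMES)]
stacked = [sum(px) / FRAMES for px in zip(*frames)]  # per-pixel mean

single_noise = statistics.pstdev(frames[0])
stacked_noise = statistics.pstdev(stacked)
print(single_noise, stacked_noise)  # stacked should be about 4x lower for 16 frames
```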
The whole area is "untapped" for full-frame cameras, someone just needs to throw a few server grade GPUs at the problem for a while!
This stuff exists and it's fairly well-studied. It's surprisingly hard to find without coming across it in literature though, the universe of image processing is huge. Joint demosaicing, for example, is a decades-old technique [0] fairly common in astrophotography. Commercial photographers simply never cared or asked for it, and so the tools intended for them didn't bother either. You'd find more of it in things like scientific ISP and robotics.
I trawled through much of the research but as you’ve mentioned it seems to be known only in astrophotography and mobile devices or other similarly constrained hardware.
From the classic file format "ppm" (portable pixel map) the ppm to pgm (portable grayscale map) man page:
https://linux.die.net/man/1/ppmtopgm
You'll note the relatively high value of green there, making up nearly 60% of the luminosity of the resulting grayscale image.
I also love the quote in there...
(context for the original - https://www.youtube.com/watch?v=VNC54BKv3mc )
After that, the next most important problem is the fact he operates in the wrong color space, where he's boosting raw RGB channels rather than luminance. That means that some objects appear much too saturated.
So his photo isn't "unprocessed", it's just incorrectly processed.
When I worked at Amazon on the Kindle Special Offers team (ads on your eink Kindle while it was sleeping), the first implementation of auto-generated ads was by someone who didn't know that properly converting RGB to grayscale was a smidge more complicated than just averaging the RGB channels. So for ~6 months in 2015ish, you may have seen a bunch of ads that looked pretty rough. I think I just needed to add a flag to the FFmpeg call to get it to convert RGB to luminance before mapping it to the 4-bit grayscale needed.
I remember trying out some of the home-made methods while I was implementing a creative work section for a school assignment. It’s surprising how "flat" the basic average looks until you actually respect the coefficients (usually some flavor of 0.21R + 0.72G + 0.07B). I bet it's even more apparent in a 4-bit display.
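A quick sketch of the difference, assuming 8-bit gamma-encoded input, the 0.299/0.587/0.114 weights from the ppmtopgm man page, and a 16-level output like the 4-bit display mentioned above. The helper names are invented.

```python
def to_4bit(value):
    """Quantise an 8-bit value to the 16 levels of a 4-bit display."""
    return round(value / 255 * 15)

def gray_average(r, g, b):
    """Naive grayscale: plain mean of the three channels."""
    return (r + g + b) / 3

def gray_luma(r, g, b):
    """Weighted grayscale using the ppmtopgm coefficients."""
    return 0.299 * r + 0.587 * g + 0.114 * b

colors = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}
levels = {name: (to_4bit(gray_average(*rgb)), to_4bit(gray_luma(*rgb)))
          for name, rgb in colors.items()}
print(levels)  # (average, luma) level per colour
```

With plain averaging all three primaries collapse to the same grey level, which is exactly the "flat" look; the weighted version keeps green bright and blue dark.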
I went to a photoshop conference. There was a session on converting color to black and white. Basically at the end the presenter said you try a bunch of ways and pick the one that looks best.
(people there were really looking for the “one true way”)
I shot a lot of black and white film in college for our paper. One of my obsolete skills was thinking how an image would look in black and white while shooting, though I never understood the people who could look at a scene and decide to use a red filter..
These are the coefficients I use regularly.
It is necessary that it was precisely defined because of the requirements of backwards-compatible color transmission (YIQ is the common abbreviation for the NTSC color space, I being ~reddish and Q being ~blueish), basically they treated B&W (technically monochrome) pictures like how B&W film and videotubes treated them: great in green, average in red, and poorly in blue.
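Python's standard library happens to ship a YIQ conversion in `colorsys`, which makes the backwards-compatibility trick easy to see: the Y component is what a monochrome set displays, and I/Q vanish for grey. As far as I know `colorsys` uses slightly rounded coefficients, so treat the exact numbers as approximate.

```python
import colorsys

def monochrome_signal(r, g, b):
    """Y component of YIQ: the part a black-and-white set would show."""
    y, _i, _q = colorsys.rgb_to_yiq(r, g, b)
    return y

gray = colorsys.rgb_to_yiq(0.5, 0.5, 0.5)
primaries = {name: monochrome_signal(*rgb)
             for name, rgb in {"red": (1, 0, 0),
                               "green": (0, 1, 0),
                               "blue": (0, 0, 1)}.items()}
print(gray)       # I and Q are (near) zero for a neutral grey
print(primaries)  # green brightest, then red, then blue
```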
A bit unrelated: before the transition to color, the makeup used was actually slightly greenish too (which appears nicely in monochrome).
Page 5 has:
The last equation has those coefficients.
Like q3_sqrt
There is no such thing as “unprocessed” data, at least that we can perceive.
And different films and photo papers can have totally different looks, defined by the chemistry of the manufacturer and however _they_ want things to look.
You’re right about Ansel Adams. He “dodged and burned” extensively (lightened and darkened areas when printing.) Photoshop kept the dodge and burn names on some tools for a while.
https://m.youtube.com/watch?v=IoCtni-WWVs
When we printed for our college paper we had a dial that could adjust the printed contrast a bit of our black and white “multigrade” paper (it added red light). People would mess with the processing to get different results too (cold/ sepia toned). It was hard to get exactly what you wanted and I kind of see why digital took over.
what's so special about green? oh so just because our eyes are more sensitive to green we should dedicate double the area to green in camera sensors? i mean, probably yes. but still. (⩺_⩹)
Why are YUV and other luminance-chrominance color spaces important for an RGB input? Because many processing steps and encoders work in YUV colorspaces. This wasn't really covered in the article.
For instance, the Leica M series have specific monochrome versions with huge resolutions and better monochrome rendering.
You can also modify some cameras and remove the filter, but the results usually need processing. A side effect is that the now exposed sensor is more sensitive to both ends of the spectrum.
The JPEGs cameras produce are heavily processed, and they are emphatically NOT "original". Taking manual control of that process to produce an alternative JPEG with different curves, mappings, calibrations, is not a crime.
I know my Sony gear can't call out to AI because the WIFI sucks like every other Sony product and barely works inside my house, but also I know the first ILC manufacturer that tries to put AI right into RAW files is probably the first to leave part of the photography market.
That said I'm a purist to the point where I always offer RAWs for my work [0] and don't do any photoshop/etc. D/A, horizon, bright adjust/crop to taste.
Where phones can possibly do better is the smaller size and true MP structure of a cell phone camera sensor, which makes it easier to handle things like motion blur and rolling shutter.
But, I have yet to see anything that gets closer to an ILC for true quality than the decade+ old PureView cameras on Nokia phones, probably partially because they often had sensors large enough.
There's only so much computation can do to simulate true physics.
[0] - I've found people -like- that. TBH, it helps that I tend to work cheap or for barter type jobs in that scene, however it winds up being something where I've gotten repeat work because they found me and a 'photoshop person' was cheaper than getting an AIO pro.
And then there's all the nonsense BigTech enables out of the box today with automated AI touch ups. That definitely qualifies as fakery although the end result may be visually pleasing and some people might find it desirable.
if you take that away, a picture is not very interesting, it's hyperrealistic so not super creative a lot of the time (compared to eg paintings), & it doesn't even require the mastery of other mediums to get hyperrealism
I can't see infrared.
But they don't see it as IR. Instead, this infrared information just kind of irrevocably leaks into the RGB channels that we do perceive. With the unmodified camera on my Samsung phone, IR shows up kind of purple-ish. Which is... well... it's fake. Making invisible IR into visible purple is an artificially-produced artifact of the process that results in me being able to see things that are normally ~impossible for me to observe with my eyeballs.
When you generate your own "genuine" images using your digital camera(s), do you use an external IR filter? Or are you satisfied with knowing that the results are fake?
this is totally out of my own self-interest, no problems with its content
I'm imagining a sort of Logan's Run-like scifi setup where only people with a documented em dash before November 30, 2022, i.e. D(ash)-day, are left with permission to write.
At least Robespierre needed two sentences before condemning a man. Now the mob is lynching people on the basis of a single glyph.
I have actually been deliberately modifying my long-time writing style and use of punctuation to look less like an LLM. I'm not sure how I feel about this.
But now, likewise, having to bail on emdashes. My last differentiator is that I always close set the emdash—no spaces on either side, whereas ChatGPT typically opens them (AP Style).
Russians have used this for at least 15 years
https://ilyabirman.ru/typography-layout/
also your question implies a bad assumption even if you disclaim it. if you don't want to imply a bad assumption the way to do that is to not say the words, not disclaim them
“NO EM DASHES” is common system prompt behavior.
What bothers me as an old-school photographer is this. When you really pushed it with film (e.g. overprocess 400ISO B&W film to 1600 ISO and even then maybe underexpose at the enlargement step) you got nasty grain. But that was uniform "noise" all over the picture. Nowadays, noise reduction is impressive, but at the cost of sometimes changing the picture. For example, the IP cameras I have, sometimes when I come home on the bike, part of the wheel is missing, having been deleted by the algorithm as it struggled with the "grainy" asphalt driveway underneath.
Smartphone and dedicated digital still cameras aren't as drastic, but when zoomed in, or in low light, faces have a "painted" kind of look. I'd prefer honest noise, or better yet an adjustable denoising algorithm from "none" (grainy but honest) to what is now the default.
It was my fault that I didn't check the pictures while I was doing it. Imagine my disappointment when I checked them back at home: the Android camera decided to apply some kind of AI filter to all the pictures. Now my grandparents don't look like them at all, they are just an AI version.
Heavy denoising is necessary for cheap IP cameras because they use cheap sensors paired with high f-number optics. Since you have a photography background you'll understand the tradeoff that you'd have to make if you could only choose one lens and f-stop combination but you needed everything in every scene to be in focus.
You can get low-light IP cameras or manual focus cameras that do better.
The second factor is the video compression ratio. The more noise you let through, the higher bitrate needed to stream and archive the footage. Let too much noise through for a bitrate setting and the video codec will be ditching the noise for you, or you'll be swimming in macroblocks. There are IP cameras that let you turn up the bitrate and decrease the denoise setting like you want, but be prepared to watch your video storage times decrease dramatically as most of your bits go to storing that noise.
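A rough way to see the noise-versus-bitrate tradeoff without a video codec: compress a clean gradient and a noisy one with `zlib`. This is only a stand-in (a lossless general-purpose compressor, not H.264 or similar), and the frame size and noise level are arbitrary.

```python
import random
import zlib

random.seed(1)
WIDTH, HEIGHT = 256, 64

def frame(noise_sigma):
    """8-bit horizontal gradient with optional Gaussian noise added."""
    data = bytearray()
    for _y in range(HEIGHT):
        for x in range(WIDTH):
            v = x + random.gauss(0, noise_sigma) if noise_sigma else x
            data.append(min(255, max(0, round(v))))
    return bytes(data)

clean_size = len(zlib.compress(frame(0)))
noisy_size = len(zlib.compress(frame(8)))
print(clean_size, noisy_size)  # the noisy frame needs far more bytes
```

Noise is incompressible by definition, so every bit of it that survives denoising has to be paid for in bitrate or storage.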
> Smartphone and dedicated digital still cameras aren't as drastic, but when zoomed in, or in low light, faces have a "painted" kind of look. I'd prefer honest noise, or better yet an adjustable denoising algorithm from "none" (grainy but honest) to what is now the default.
If you have an iPhone then getting a camera app like Halide and shooting in one of the RAW formats will let you do this and more. You can also choose Apple ProRAW on recent iPhone Pro models which is a little more processed, but still provides a large amount of raw image data to work with.
This is interesting to think about, at least for us photo nerds. ;) I honestly think there are multiple right answers, but I have a specific one that I prefer. Applying the same transfer function to all pixels corresponds pretty tightly to film & paper exposure in analog photography. So one reasonable followup question is: did we count manually over- or under-exposing an analog photo to be manipulation or “processing”? Like you can’t see an image without exposing it, so even though there are timing & brightness recommendations for any given film or paper, generally speaking it’s not considered manipulation to expose it until it’s visible. Sometimes if we pushed or pulled to change the way something looks such that you see things that weren’t visible to the naked eye, then we call it manipulation, but generally people aren’t accused of “photoshopping” something just by raising or lowering the brightness a little, right?
When I started reading the article, my first thought was, ‘there’s no such thing as an unprocessed photo that you can see’. Sensor readings can’t be looked at without making choices about how to expose them, without choosing a mapping or transfer function. That’s not to mention that they come with physical response curves that the author went out of his way to sort-of remove. The first few dark images in there are a sort of unnatural way to view images, but in fact they are just as processed as the final image, they’re simply processed differently. You can’t avoid “processing” a digital image if you want to see it, right? Measuring light with sensors involves response curves, transcoding to an image format involves response curves, and displaying on monitor or paper involves response curves, so any image has been processed a bunch by the time we see it, right? Does that count as “processing”? Technically, I think exposure processing is always built-in, but that kinda means exposing an image is natural and not some type of manipulation that changes the image. Ultimately it depends on what we mean by “processing”.
Reading reg plates would be a lot easier if I could sharpen the image myself rather than try to battle with the "turn it up to 11" approach by manufacturers.
Noise is part of the world itself.
Artist develops a camera that takes AI-generated images based on your location. Paragraphica generates pictures based on the weather, date, and other information.
https://www.standard.co.uk/news/tech/ai-camera-images-paragr...
Years ago I became suspicious of my Samsung Android device when I couldn't produce a reliable likeness of an allergy induced rash. No matter how I lit things, the photos were always "nicer" than what my eyes recorded live.
The incentives here are clear enough - people will prefer a phone whose camera gives them an impression of better skin, especially when the applied differences are extremely subtle and don't scream airbrush. If brand-x were the only one to allow "real skin" into the gallery viewer, people and photos would soon be decried as showing 'x-skin', which would be considered gross. Heaven help you if you ever managed to get close to a mirror or another human.
To this day I do not know whether it was my imagination or whether some inline processing effectively does or did perform micro airbrushing on things like this.
Whatever did or does happen, the incentive is evergreen - media capture must flatter the expectations of its authors, without getting caught in its sycophancy. All the while, capacity improves steadily.
It gets even wilder when perceiving space and time as additional signal dimensions.
I imagine a sort of absolute reality that is the universe. And we’re all just sensor systems observing tiny bits of it in different and often overlapping ways.
What a nice way to put it.
I've had this in mind at times in recent years due to $DAYJOB. We use simulation heavily to provide fake CPUs, hardware devices, what have you, with the goal of keeping our target software happy by convincing it that it's running in its native environment instead of on a developer laptop.
Just keep in mind that it's important not to go _too_ far down the rabbit hole, one can spend way too much time in "what if we're all just brains in jars?"-land.
A better discriminator might be global edits vs local edits, with local edits being things like retouching specific parts of the image to make desired changes, and one could argue that local edits are "more fake" than global edits, but it still depends on a thousand factors, most importantly intent.
"Fake" images are images with intent to deceive. By that definition, even an image that came straight out of the camera can be "fake" if it's showing something other than what it's purported to (e.g. a real photo of police violence but with a label saying it's in a different country is a fake photo).
What most people think when they say "fake", though, is a photo that has had filters applied, which makes zero sense. As the post shows, all photos have filters applied. We should get over that specific editing process, it's no more fake than anything else.
Filters themselves don't make it fake, just like words themselves don't make something a lie. How the filters and words are used, whether they bring us closer or further from some truth, is what makes the difference.
Photos implicitly convey, usually, 'this is what you would see if you were there'. Obviously filters can help with that, as in the OP, or hurt.
Artists, who use these tools with clear vision and intent to achieve specific goals, strangely never have this problem.
The ones that make the annual rounds up here in New England are those foliage photos with saturation jacked. “Look at how amazing it was!” They’re easy to spot since doing that usually wildly blows out the blues in the photo unless you know enough to selectively pull those back.
If you want reality, go there in person and stop looking at photos. Viewing imagery is a fundamentally different type of experience.
We’ve had similar debates about art using miniatures and lens distortions versus photos since photography was invented — and digital editing fell on the lens trick and miniature side of the issue.
Portrait photography -- no, people don't look like that in real life with skin flaws edited out
Landscape photography -- no, the landscapes don't look like that 99% of the time, the photographer picks the 1% of the time when it looks surreal
Staged photography -- no, it didn't really happen
Street photography -- a lot of it is staged spontaneously
Product photography -- no, they don't look like that in normal lighting
A lot of armchair critics on the internet who only go out to their local park at high noon will say they look fake but they're not.
What about this? https://news.ycombinator.com/item?id=35107601
You can look it up because it's published on the web but IIRC it's generally what you'd expect. It's okay to do whole-image processing where all pixels have the same algorithm applied like the basic brightness, contrast, color, tint, gamma, levels, cropping, scaling, etc filters that have been standard for decades. The usual debayering and color space conversions are also fine. Selectively removing, adding or changing only some pixels or objects is generally not okay for journalistic purposes. Obviously, per-object AI enhancement of the type many mobile phones and social media apps apply by default doesn't meet such standards.
I downsized it to 170x170 pixels
My point is that there IS an experiment which would show that Samsung is doing some nonstandard processing likely involving replacement. The evidence provided is insufficient to show that.
For example see
https://en.wikipedia.org/wiki/Edge_enhancement
You can try to guess the location of edges to enhance them after upscaling, but it's guessing, and when the source has the detail level of a 170x170 moon photo a big proportion of the guessing will inevitably be wrong.
And in this case it would take a pretty amazing unblur to even get to the point it can start looking for those edges.
You did not link an example of upscaling, the before and after are the same size.
Unsharp filters enhance false edges on almost all images.
If you claim either one of those are wrong, you're being ridiculous.
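For what it's worth, the unsharp claim is easy to check on a toy 1D signal. This is only a sketch: a box blur stands in for the Gaussian blur a real unsharp mask would normally use, and the `amount` and `radius` values are arbitrary.

```python
def box_blur(signal, radius=1):
    """Mean filter over a (2 * radius + 1) window, clamping at the ends."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def unsharp_mask(signal, amount=1.5, radius=1):
    """Add back `amount` times the difference between the signal and its blur."""
    blurred = box_blur(signal, radius)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


# A flat area with a single one-unit blip, e.g. sensor noise rather than detail.
flat_with_blip = [10.0] * 5 + [11.0] + [10.0] * 5
sharpened = unsharp_mask(flat_with_blip)

print("contrast before:", max(flat_with_blip) - min(flat_with_blip))
print("contrast after: ", round(max(sharpened) - min(sharpened), 3))
```

The filter has no way to know whether the blip is real detail or noise; it boosts the local contrast either way and adds a dark dip on each side of it.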
And to be clear, everyone including Apple has been doing this since at least 2017
The problem with what Samsung was doing is that it was moon-specific detection and replacement
I zoomed in on the monitor showing that image and, guess what, again you see slapped on detail, even in the parts I explicitly clipped (made completely 100% white):
The first words I said were that Samsung probably did this
And you're right that I didn't read the dozens of edits which were added after the original post. I was basing my arguments off everything before the "conclusion section", which it seems the author understands was not actually conclusive.
I agree that the later experiments, particularly the "two moons" experiment were decisive.
Also to be clear, I know that Samsung was doing this, because as I said I worked at a competitor. At the time I did my own tests on Samsung devices because I was also working on moon related image quality
i.e. Camera+Lens+ISO+SS+FStop+FL+TC (If present)+Filter (If present). Add focus distance if being super duper proper.
And some of that is to help at least provide the right requirements to try to recreate.
For example if you add snow to a shot with masking or generative AI. It's fake because the real life experience was not actually snowing. You can't just hallucinate a major part of the image - that counts as fake to me. A major departure from the reality of the scene. Many other types of edits don't have this property because they are mostly based on the reality of what occurred.
I think for me this comes from an intrinsic valuing of the act/craft of photography, in the physical sense. Once an image is too digitally manipulated then it's less photography and more digital art.
Even that isn't all that clear-cut. Is noise removal a local edit? It only touches some pixels, but obviously, that's a silly take.
Is automated dust removal still global? The same idea, just a bit more selective. If we let it slide, what about automated skin blemish removal? Depth map + relighting, de-hazing, or fake bokeh? I think that modern image processing techniques really blur the distinction here because many edits that would previously need to be done selectively by hand are now a "global" filter that's a single keypress away.
Intent is the defining factor, as you note, but intent is... often hazy. If you dial down the exposure to make the photo more dramatic / more sinister, you're manipulating emotions too. Yet, that kind of editing is perfectly OK in photojournalism. Adding or removing elements for dramatic effect? Not so much.
The only process in the article that involves nearby pixels is to combine R G and B (and other G) into one screen pixel. (In principle these could be mapped to subpixels.) Everything fancier than that can be reasonably called some fake cosmetic bullshit.
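A minimal sketch of that one neighbourhood step, assuming an RGGB layout and a plain average of the two greens (the article may weight or arrange things differently, and `bin_rggb` is just a name I made up):

```python
def bin_rggb(mosaic):
    """Collapse each 2x2 RGGB cell of a raw mosaic into one (R, G, B) pixel."""
    out = []
    for y in range(0, len(mosaic) - 1, 2):
        row = []
        for x in range(0, len(mosaic[0]) - 1, 2):
            r = mosaic[y][x]                              # top-left: red
            g = (mosaic[y][x + 1] + mosaic[y + 1][x]) / 2  # the two greens
            b = mosaic[y + 1][x + 1]                      # bottom-right: blue
            row.append((r, g, b))
        out.append(row)
    return out


print(bin_rggb([[200, 100],
                [120, 50]]))
```

Each output pixel only ever looks inside its own 2x2 cell, which is why the result has half the width and half the height of the sensor.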
Raw formats usually carry "Bayer-filtered linear (well, almost linear) light in device-specific color space", not necessarily "raw unprocessed readings from the sensor array", although some vendors move it slightly more towards the latter than others.
Removing dust and blemishes entails looking at more than one pixel at a time.
Nothing in the basic processing described in the article does that.
So there are levels of image processing, and it would be wrong to dump them all in the same category.
This seems more a limitation of monitors. If you had very large bit depth, couldn't you just display images in linear light without gamma correction?
If you kept it linear all the way to the output pixels, it would look fine. You only have to go nonlinear because the screen expects nonlinear data. The screen expects this because it saves a few bits, which is nice but far from necessary.
To put it another way, it appears so dark because it isn't being "displayed directly". It's going directly out to the monitor, and the chip inside the monitor is distorting it.
Why exactly? My understanding is that gamma correction is effectively an optimization scheme during encoding to allocate bits in a perceptually uniform way across the dynamic range. But if you just have enough bits to work with and are not concerned with file sizes (and assuming all hardware could support these higher bit depths), then this shouldn't matter? IIRC unlike CRTs, LCDs don't have a power curve response in terms of the hardware anyway, and emulate the overall 2.2 TRC via LUT. So you could certainly get monitors to accept linear input (assuming you manage to crank up the bit depth enough to the point where you're not losing perceptual fidelity), and just do everything in linear light.
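A quick way to see the bit-allocation argument is to count how many integer codes land in the darkest 1% of the linear range, with and without the sRGB curve. This is a sketch: the 1% threshold is an arbitrary choice of mine, and the standard piecewise sRGB curve stands in for the "2.2" response.

```python
def srgb_decode(v):
    """sRGB-encoded value in [0, 1] -> linear light in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_decode(v):
    """Linear encoding: the stored value already is linear light."""
    return v


def codes_below(threshold, decode, bits=8):
    """Count integer codes whose decoded linear value is <= threshold."""
    top = 2**bits - 1
    return sum(1 for code in range(top + 1) if decode(code / top) <= threshold)


for bits in (8, 10, 12):
    print(f"{bits}-bit: linear={codes_below(0.01, linear_decode, bits)}, "
          f"sRGB={codes_below(0.01, srgb_decode, bits)}")
```

By my count, 8-bit linear gets 3 codes for that darkest 1% versus 26 for 8-bit sRGB, and linear needs about 12 bits (41 codes) to pass that. So "just use more bits" does work, it simply costs roughly four extra bits per channel for the same shadow precision.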
In fact if you just encoded the linear values as floats that would probably give you best of both worlds, since floating point is basically log-encoding where density of floats is lower at the higher end of the range.
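The float point can be checked directly with `math.ulp` (Python 3.9+), which returns the gap to the next representable value. A small sketch; Python floats are 64-bit, whereas an image pipeline would more likely use 16- or 32-bit floats, but the pattern is the same.

```python
import math

values = (0.001, 0.01, 0.1, 1.0, 10.0, 1000.0)
relative_gaps = []

for x in values:
    gap = math.ulp(x)  # distance from x to the next representable float
    relative_gaps.append(gap / x)
    print(f"x={x:<8} gap={gap:.3e} relative gap={gap / x:.3e}")
```

The absolute gap grows with the value while the relative gap stays within a factor of two, which is roughly what a log encoding gives you.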
https://www.scantips.com/lights/gamma2.html (I don't agree with a lot of the claims there, but it has a nice calculator)
== Tim's Vermeer ==
Specifically Tim's quote "There's also this modern idea that art and technology must never meet - you know, you go to school for technology or you go to school for art, but never for both... And in the Golden Age, they were one and the same person."
https://en.wikipedia.org/wiki/Tim%27s_Vermeer
https://www.imdb.com/title/tt3089388/quotes/?item=qt2312040
== John Lind's The Science of Photography ==
Best explanation I ever read on the science of photography https://johnlind.tripod.com/science/scienceframe.html
== Bob Atkins ==
Bob used to have some incredible articles on the science of photography that were linked from photo.net back when Philip Greenspun owned and operated it. A detailed explanation of digital sensor fundamentals (e.g. why bigger wells are inherently better) particularly sticks in my mind. They're still online (bookmarked now!)
https://www.bobatkins.com/photography/digital/size_matters.h...
But anyway, I enjoyed the article.
I’ve been staring at 16-bit HDR greyscale space for so long…
Then -> than? (In case the author is reading comments here.)
He takes a few minutes to get to the punch line. Feel free to skip ahead to around 5:30.
I spent a good part of my career working in image processing.
That first image is pretty much exactly what a raw Bayer format looks like, without any color information. I find it gets even more interesting, if we add the RGB colors, and use non-square pixels.
Is the output produced by the sensor RGB or a single value per pixel?
In front of the sensor is a Bayer filter which results in each physical pixel seeing illumination filtered R, G or B.
From there the software onboard the camera or in your RAW converter does interpolation to create RGB values at each pixel. For example if the local pixel is R filtered, it then interpolates its G & B values from nearby pixels of that filter.
https://en.wikipedia.org/wiki/Bayer_filter
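The interpolation step can be sketched in a few lines. This assumes an RGGB layout and the simplest neighbour-averaging (bilinear-style) scheme; real cameras and RAW converters use considerably smarter, edge-aware algorithms. The function names are mine.

```python
def bayer_color(y, x):
    """Colour of the filter over photosite (y, x) in an RGGB layout."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"


def demosaic_bilinear(mosaic):
    """Fill in the two missing channels at each photosite by averaging the
    same-colour photosites in the surrounding 3x3 window (needs >= 2x2)."""
    h, w = len(mosaic), len(mosaic[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            pixel = []
            for channel in "RGB":
                if bayer_color(y, x) == channel:
                    pixel.append(float(mosaic[y][x]))  # measured directly
                    continue
                samples = [mosaic[ny][nx]
                           for ny in range(max(0, y - 1), min(h, y + 2))
                           for nx in range(max(0, x - 1), min(w, x + 2))
                           if bayer_color(ny, nx) == channel]
                pixel.append(sum(samples) / len(samples))
            row.append(tuple(pixel))
        out.append(row)
    return out


# A uniformly coloured scene as seen through the filter array.
scene = {"R": 200, "G": 100, "B": 50}
mosaic = [[scene[bayer_color(y, x)] for x in range(4)] for y in range(4)]
print(demosaic_bilinear(mosaic)[1][1])
```

On a flat patch like this the colour is reconstructed exactly; the interesting failures (zippering, false colour) show up at sharp edges, where the averaged neighbours come from different sides of the edge.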
There are alternatives such as what Fuji does with its X-trans sensor filter.
https://en.wikipedia.org/wiki/Fujifilm_X-Trans_sensor
Another alternative is Foveon (owned by Sigma now) which makes full color pixel sensors but they have not kept up with state of the art.
https://en.wikipedia.org/wiki/Foveon_X3_sensor
This is also why Leica B&W sensor cameras have higher apparent sharpness & ISO sensitivity than the related color sensor models because there is no filter in front or software interpolation happening.
https://en.wikipedia.org/wiki/Pixel_shift
EDIT: Sigma also has "Foveon" sensors that do not have the filter and instead stack multiple sensors (for different wavelengths) at each pixel.
https://en.wikipedia.org/wiki/Foveon_X3_sensor
Works great. Most astro shots are taken using a monochrome sensor and filter wheel.
> filters are something like quantum dots that can be turned on/off
If anyone has this tech, plz let me know! Maybe an etalon?
https://en.wikipedia.org/wiki/Fabry%E2%80%93P%C3%A9rot_inter...
I have no idea, it was my first thought when I thought of modern color filters.
AKA imagine a camera with R/G/B filters being quickly rotated out for 3 exposures, then imagine it again but the technology is integrated right into the sensor (and, ideally, the sensor and switching mechanism is fast enough to read out with rolling shutter competitive with modern ILCs)
Edit: or maybe it does work? I've watched at least one movie on a DLP type video projector with sequential colour and not noticed colour fringing. But still photos have much higher demands here.
Each RGB pixel would be 2x2 grid of
```
G R
B G
```
So G appears twice as often as each of the other colors (this is mostly the same for both the screen and sensor technology).
There are different ways to do the color filter layouts for screens and sensors (Fuji X-Trans have different layout, for example).
[1] https://en.wikipedia.org/wiki/Bayer_filter#Explanation
[2] https://en.wikipedia.org/wiki/PenTile_matrix_family
Almost, but not quite? The camera works with more data than what's present in the JPG your image editing software sees.
Generally we shoot “flat” (there are so many caveats to this but I don’t feel like getting bogged down in all of it. If you plan on getting down and dirty with colors and really grading, you generally shoot flat). The image that we hand over to DIT/editing can be borderline grayscale in its appearance. The colors are so muted, the dynamic range is so wide, that you basically have a highly muted image. The reason for this is you then have the freedom to “push” the color and look in almost any direction, versus if you have a very saturated, high contrast image, you are more “locked” into that look. This matters more and more when you are using a compressed codec and not something with an incredibly high bitrate or raw codecs, which is a whole other world and I am also doing a bit of a disservice to by oversimplifying.
Though this being HN it is incredibly likely I am telling few to no people anything new here lol
It's sort of the opposite of what's going on with photography, where you have a dedicated "raw" format with linear readings from the sensor. Without these formats, someone would probably have invented "log JPEG" or something like that to preserve more data in highlights and in the shadows.
[0] - https://en.wikipedia.org/wiki/Super_CCD#/media/File:Fuji_CCD...
Processing these does seem like more fun though.
https://en.wikipedia.org/wiki/Analog-to-digital_converter
I think the headline is problematic because it suggests the raw photos aren't very good and thus need processing, however the raw data isn't something the camera makers intend to be put forth as a photo, and the data is intended to be processed right from the start. The data of course can be presented as images, but those serve as visualizations of the data rather than the source image or photo. Wikipedia does it a lot more justice. https://en.wikipedia.org/wiki/Raw_image_format If articles like OP's catch on, camera makers might be incentivized to game the sensors so their output makes more sense to the general public, and that would be inefficient, so the proper context should be given, which this "unprocessed photo" article doesn't do in my opinion.
Something that surprised me is that very little of the computational photography magic that has been developed for mobile phones has been applied to larger DSLRs. Perhaps it's because it's not as desperately needed, or because prior to the current AI madness nobody had sufficient GPU power lying around for such a purpose.
For example, it's a relatively straightforward exercise to feed in "dark" and "flat" frames as extra per-pixel embeddings, which lets the model learn about the specifics of each individual sensor and its associated amplifier. In principle, this could allow not only better denoising, but also stretch the dynamic range a tiny bit by leveraging the less sensitive photosites in highlights and the more sensitive ones in the dark areas.
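For anyone who hasn't met dark and flat frames before, this is the classical (non-learned) calibration they are normally used for, not the model-based approach described above. A sketch with a made-up 2x2 sensor whose per-pixel gains and offsets are invented for the example:

```python
def calibrate(raw, dark, flat):
    """Classical calibration: subtract the dark frame, then divide by the
    dark-subtracted flat frame, rescaled by its mean to keep overall level."""
    flat_signal = [[f - d for f, d in zip(frow, drow)]
                   for frow, drow in zip(flat, dark)]
    count = sum(len(row) for row in flat_signal)
    mean_flat = sum(sum(row) for row in flat_signal) / count
    return [[(r - d) / fs * mean_flat
             for r, d, fs in zip(rrow, drow, fsrow)]
            for rrow, drow, fsrow in zip(raw, dark, flat_signal)]


# Hypothetical sensor: every photosite has its own gain and dark offset.
gain = [[0.9, 1.0], [1.1, 1.0]]
offset = [[5.0, 7.0], [6.0, 4.0]]
scene_level, flat_level = 100.0, 500.0

raw = [[scene_level * g + o for g, o in zip(grow, orow)]
       for grow, orow in zip(gain, offset)]
dark = offset  # exposure with the shutter closed
flat = [[flat_level * g + o for g, o in zip(grow, orow)]
        for grow, orow in zip(gain, offset)]  # exposure of a uniform target

corrected = calibrate(raw, dark, flat)
print([[round(v, 3) for v in row] for row in corrected])
```

The uneven raw readings collapse back to a uniform value once the per-pixel offset and gain are divided out, which is the sensor-specific information a learned model would be picking up from the same frames.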
Similarly, few if any photo editing products do simultaneous debayering and denoising, most do the latter as a step in normal RGB space.
Not to mention multi-frame stacking that compensates for camera motion, etc...
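The payoff from stacking is easy to simulate. This sketch leaves out the hard part, aligning frames to compensate for camera motion, and just averages perfectly registered frames with synthetic Gaussian noise; the frame count and noise level are arbitrary.

```python
import random
import statistics


def noisy_frame(scene, sigma, rng):
    """One exposure: the true scene plus independent Gaussian noise per pixel."""
    return [value + rng.gauss(0.0, sigma) for value in scene]


def stack_mean(frames):
    """Average already-aligned frames pixel by pixel."""
    return [sum(pixels) / len(pixels) for pixels in zip(*frames)]


rng = random.Random(0)
scene = [100.0] * 2000  # flat grey patch, so any spread is pure noise
frames = [noisy_frame(scene, 10.0, rng) for _ in range(16)]

single_noise = statistics.pstdev(frames[0])
stacked_noise = statistics.pstdev(stack_mean(frames))
print(f"single frame noise: {single_noise:.2f}")
print(f"16-frame stack:     {stacked_noise:.2f}")
```

For independent noise the standard deviation falls by about the square root of the number of frames, so 16 frames should give roughly a 4x improvement.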
The whole area is "untapped" for full-frame cameras, someone just needs to throw a few server grade GPUs at the problem for a while!
[0] https://doi.org/10.1145/2980179.2982399