Digital Images - Computerphile


So, unlike a normal photograph, a digital image is made up of pixels: small individual locations of a certain color, or a certain level of grayscale intensity. It's stored in memory as basically a very long list, and along with other information about the width and the height of the image we can then access those pixels, determine what color they are, and do other things like applying filters, or compiling them into some kind of video, or something like that. To keep it simple to begin with, for this demonstration we'll talk only about uncompressed images already loaded into memory. We won't be in a specific file format, like a GIF file or a BMP file (we'll talk about those some other time); we'll talk just about uncompressed images in memory, how they're stored and
how we use them.

Usually memory is a contiguous block, so it's one long line, and so it's very helpful to us to represent two-dimensional images as, in fact, a very long line of data. We usually start an image with some kind of header, and that will tell us what the image format is, how wide it is, how tall it is, and if there's any other information, like EXIF data or camera calibration data, that sort of thing will be included in there too. Then we essentially have a very, very long list of pixels. So we start here, and this point here will be the first row of our image: we'll have a pixel here, which we'll call pixel 1, and then we'll have another pixel here, pixel 2, and how long each of these pixels is in memory will depend on the type of image we're looking at.

So if we're doing, let's say, a 2 by 2 image, then our image will look a bit like this when it's finished: pixels 1, 2, 3 and 4. This is our image; it's two pixels high and two pixels wide. In actual memory we have our header, then we have the first pixel and the second pixel, and that's row one; then we might have some padding data (we won't worry too much about that, it depends on the file format); and then we'll go straight on to our second row, so we have pixel 3 and pixel 4, and that is essentially our image stored in memory. And then, because we know how wide and how high the image is, we can index these directly: we can say that if this is our stride, which is the length of pixel 1 plus pixel 2 plus the padding, then we can go one stride along to get to the next row, and two strides along to get to the row after that, and so on, and we can index the image like that.
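To make that concrete, here is a minimal C sketch of the same idea (the struct fields, the pixel values and the two bytes of row padding are invented for the example, not taken from the video): a small header followed by one long contiguous run of bytes, walked row by row using the stride.

    #include <stdint.h>
    #include <stdio.h>

    /* A hypothetical in-memory image: a small "header" plus one long list of bytes. */
    struct Image {
        int      width;   /* pixels per row                   */
        int      height;  /* number of rows                   */
        int      stride;  /* bytes per row, including padding */
        uint8_t *data;    /* the long contiguous pixel list   */
    };

    int main(void)
    {
        /* A 2 x 2 RGB image: 3 bytes per pixel plus 2 bytes of row padding (stride = 8). */
        uint8_t pixels[2 * 8] = {
            255, 0,   0,     0,   255, 0,    0, 0,  /* row 0: a red pixel, a green pixel, padding */
            0,   0,   255,   255, 255, 255,  0, 0   /* row 1: a blue pixel, a white pixel, padding */
        };
        struct Image img = { 2, 2, 8, pixels };

        /* Walk the rows using the stride, as described above. */
        for (int y = 0; y < img.height; y++) {
            const uint8_t *row = img.data + y * img.stride;   /* jump y strides along  */
            for (int x = 0; x < img.width; x++) {
                const uint8_t *p = row + x * 3;               /* 3 bytes per RGB pixel */
                printf("pixel (%d,%d) = R%3u G%3u B%3u\n", x, y, p[0], p[1], p[2]);
            }
        }
        return 0;
    }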
So that's what the image looks like on a very basic level. Each of these pixels represents some amount of memory, and how much that is depends on the type of image we're looking at. If it's a grayscale image, generally speaking fewer bits will be used per pixel. If it's an RGB image (RGB images are by far the most common; most images that we capture are RGB), RGB is red, green and blue, and those represent the primary colors that we detect in our eyes, so that's why it's helpful to think of it in those terms.

A couple of extra properties of our image that we look at are the bit depth, which is how many bits, how many noughts and ones, represent each individual element of color or gray, and the number of channels per pixel, which in an RGB image might be three, or four for an RGB-alpha image, where alpha is the transparency. So a pixel will have a number of channels; let's say C in this case is 3, for R, G and B, and the bit depth is usually eight. You can get bit depths ranging from one, which would just be on-or-off pixels, up to 16, maybe 32, which is very high. Just like with normal binary encoding, the more bits you use per pixel, and per color channel, the more information you can hold, so 8 bits gives a maximum level of 255 for a byte, and in this case we have three channels, R, G and B, each of which can be somewhere from nought to 255.
What do those numbers represent? Well, 0 will be, what, black? Yes, it will be none of that color at all, and 255 will be the most of that color that the camera has seen; maybe not the brightest it could possibly be, because some post-processing will have taken place, things like this. And so all you're doing when you increase the bit depth is giving more gradations in between? Yes, that's exactly it. It's unlikely that you would use it to show something even brighter, because usually 255 would mean as red as you could get, let's say; you would just have a finer range of colors in between. For most general purposes a bit depth of eight is perfectly adequate, because you've got three different color channels doing that, so that's perfectly ample.
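As a small illustration of that point, here is a hedged C sketch (nothing in it comes from the video beyond the 8-bit, 255-level example): the maximum level a channel can hold is 2 to the power of the bit depth, minus 1.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* A channel of `depth` bits can hold levels 0 .. 2^depth - 1,
           so 1 bit gives on/off, 8 bits gives 0..255, and so on. */
        for (int depth = 1; depth <= 16; depth *= 2) {
            uint32_t max_level = (1u << depth) - 1;
            printf("%2d-bit channel: levels 0..%u\n", depth, max_level);
        }
        return 0;
    }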
A common format would be an eight-bits-per-pixel grayscale image. If you go to an image processing package and take your color image and convert it to grayscale, what it usually does is some averaging of the three color channels, and then a much more memory-efficient way of storing that is to represent it as gray. In that case we have our header information, and then we literally have pixel 1, which will just be a grayscale value from nought to 255. So we'll have a byte here, which is pixel 1, a byte here, which is pixel 2, and a byte here, which is pixel 3, and each of those only takes up one byte, rather than three or four for a normal RGB image, and that's why an RGB image is generally much larger.
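A minimal sketch of that averaging idea in C (this is just one plausible way to do it; real packages often use a weighted sum of the channels rather than a plain mean, and the tightly packed layout with no row padding is an assumption):

    #include <stdint.h>
    #include <stddef.h>

    /* Convert a tightly packed RGB buffer (3 bytes per pixel, no row padding)
       to an 8-bit grayscale buffer by averaging the three color channels. */
    void rgb_to_gray(const uint8_t *rgb, uint8_t *gray, size_t pixel_count)
    {
        for (size_t i = 0; i < pixel_count; i++) {
            unsigned r = rgb[3 * i + 0];
            unsigned g = rgb[3 * i + 1];
            unsigned b = rgb[3 * i + 2];
            gray[i] = (uint8_t)((r + g + b) / 3);  /* one byte out instead of three in */
        }
    }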
Alpha is very common when you're doing image editing, because it's useful for the sort of combination of layers on top of other layers, and things like this. It's obviously not very common in normal photographs, because a camera can't capture transparency; that wouldn't make much sense. But in general 32-bit images, that is, four channels per pixel, are very common even when we're not using the alpha. So you might find that your digital camera produces a 32-bit-per-pixel, four-channel image even though it doesn't actually output transparency, and that alpha is just held as a padding byte. That way we can index to our pixels in integer terms; it makes it much simpler to do the mathematics of getting to a certain pixel and doing something with it.
If this is our header and this is our color image here, then what pixel 1 in fact is, is a load of RGB and then possibly an alpha channel, possibly something that doesn't do anything. So we'd have the red here, the green, the blue, and this X here, which may be an alpha channel or may not be, and each of these in an 8-bit image will be eight bits long, so that's one byte. This is eight bits long, this is eight, this is eight, and this is eight here, and so the total size of this pixel is 32 bits, and that's what 32-bit images are. Now, 32 happens to be what a lot of computer architectures like as their integer size, or at least, if it isn't, you can get a 32-bit integer very easily, and that allows us to jump to a specific pixel somewhere in our image.
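A rough sketch of what packing such a 32-bit pixel might look like in C (the byte order, with red in the lowest byte and the X/alpha byte at the top, is an assumption; real formats use RGBA, BGRA, ARGB and others):

    #include <stdint.h>

    /* Pack four 8-bit channels into one 32-bit pixel: red in the lowest byte,
       then green, then blue, then the X/alpha byte at the top. */
    static uint32_t pack_rgbx(uint8_t r, uint8_t g, uint8_t b, uint8_t x)
    {
        return (uint32_t)r
             | ((uint32_t)g << 8)
             | ((uint32_t)b << 16)
             | ((uint32_t)x << 24);
    }

With that layout, an opaque pure-red pixel, pack_rgbx(255, 0, 0, 255), comes out as the single 32-bit integer 0xFF0000FF.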
The height of our image is useful for knowing when we're about to go off the end of the image into some other memory, but in terms of indexing pixels we don't use it. What we use is something called the stride, which is the width of a row of the image, bearing in mind any padding, and that depends on the file format. So would it be fair to say that the height is how many strides of data you have? That's exactly it, yes, but of course you're looking at a single block of memory, and if your operating system isn't being careful, you want to make sure you don't go off the end. If we know what our stride is (and that will be the width of our image in bytes, including any padding), and we have a variable x and a variable y that tell us which pixel we want, then where we want to go is the very beginning of our image, plus a certain number of rows based on our y, plus a certain number of pixels based on our x. So the actual formula is: the pixel we want is y times stride plus x, and that will take us through a certain number of rows of data, straight to the row we want and then to the pixel we want. This formula lets us jump straight to the pixel we want, and then, using some slightly more advanced programming and bit shifting, we can obtain the actual RGB data out of that integer and do things to it. We could average the channels to make a grayscale image, or we could blur them, or we could add an alpha channel if we were doing some kind of more complicated image editing, something like that.
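Putting the stride formula and the bit shifting together, a minimal C sketch (the layout assumptions, four bytes per pixel with red in the lowest byte and rows aligned for 32-bit access, are mine rather than anything stated in the video):

    #include <stdint.h>

    /* Jump straight to pixel (x, y) in a 32-bit-per-pixel image whose rows are
       `stride` bytes apart, then unpack the channels with shifts and masks.
       Assumes each row is suitably aligned for 32-bit reads and that red sits
       in the lowest byte of each pixel. */
    void get_pixel(const uint8_t *data, int stride, int x, int y,
                   uint8_t *r, uint8_t *g, uint8_t *b)
    {
        const uint32_t *row   = (const uint32_t *)(data + y * stride); /* start of row y */
        uint32_t        pixel = row[x];                                /* x pixels along */
        *r = (uint8_t)( pixel        & 0xFF);
        *g = (uint8_t)((pixel >> 8)  & 0xFF);
        *b = (uint8_t)((pixel >> 16) & 0xFF);
        /* the top byte (pixel >> 24) would be the alpha or padding channel */
    }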
Of course, I'm using image manipulation to make these Computerphile videos all the time. It's like selecting a pixel in Photoshop and deciding to change it; this is what's going on behind the scenes. If you select an individual pixel at a particular x, y location, Photoshop knows how the image is stored as a big row in memory, it will index that location, and it will alter the RGB values for you, which makes it a lot easier. If we want to turn these into pixels, all we need to do is look at the nearby pixels that have the color we're looking for and interpolate that value. So in this case we don't have a green value here, but we know what this green value is, and we know what this green value was...
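Very roughly, that interpolation step might look something like the sketch below (a deliberately simplified stand-in: it just averages the four neighbouring green samples and ignores image boundaries; real demosaicing is more involved):

    #include <stdint.h>

    /* Estimate a missing green value at (x, y) by averaging the green samples of
       the four nearest neighbours that do have one. `stride` is the width of the
       green plane in samples; boundary checks are omitted to keep the sketch short. */
    static uint8_t interpolate_green(const uint8_t *green, int stride, int x, int y)
    {
        unsigned sum = green[(y - 1) * stride + x]      /* above */
                     + green[(y + 1) * stride + x]      /* below */
                     + green[y * stride + (x - 1)]      /* left  */
                     + green[y * stride + (x + 1)];     /* right */
        return (uint8_t)(sum / 4);
    }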

24 thoughts on “Digital Images – Computerphile”

  1. If I converted an RGB image file to header files in C, can I then do some operations on it?
    If yes, then how?

  2. Technically, alpha is alpha 🙂 It has no intrinsic meaning, but can be (and usually is) used for blending, so people assume it's for transparency. You often get an alpha channel whether you want it or not for optimal memory alignment.

  3. These videos are great, thanks, but imo you'd gain a lot by giving the guy asking the questions a microphone as well. Can't really hear him so lose some of the explanation.

  4. Sir, could you please tell me in depth how images are actually stored in memory, and how the digital data of an array is stored in memory?
    Please Sir…

  5. Is it actually possible to 'read' an image in numbers, and how do programs interact with these numbers? Also, what sort of programming languages can do this and how?

  6. Why index an image as a 1d array? Why not a 2d matrix where perhaps the x, y point in the matrix corresponds to the cartesian location of the pixel in the image? Or perhaps keep the location as a 1d array, and use the second to index the channels? Is this not as efficient?

  7. The brighter than full scale colors are probably used when editing these videos in so called "video levels" range which has a small headroom to contain overbright values, and impossible colors like 'saturated white', where details can be recovered from with contrast/curves if needed.

  8. would love a video about different image compressions, and what their basic ideas are, also how some audio codecs work in principles would be cool

  9. What if we took the bits from a file, like a music file, and then created a picture with them? What would it look like?

  10. These are raster images; there are also vector images, which work pretty much how you would construct a shape on a graph in maths. Though I'm not sure of the details of how these are stored on a computer. Vector images are very useful for something you might want to resize often, as they don't lose quality as you make them bigger.

  11. If every pixel is encoded in 3 channels (r/g/b), black would be every one of these channels at 0 and white would be every channel at its max value…
    but what would be grey or "darker red" / "red mixed with white"?
    If I experiment in image editing software I can usually measure the r/g/b values of a pixel but also the "brightness", where even if red is at its max value (like 255), if I set its brightness to 0 it still would be black.
