Memorability of Images

Russell Foltz-Smith
Jul 31, 2023


Recently I ran across a nifty article, paper and code base about the memorability of images.

Well, obviously, as an artist obsessed with overanalyzing everything with technology or as a software writer obsessed with turning everything into an artistic philosophy discussion… I dug right in.

I’ve long thought art, memory, and perception are far simpler phenomena than we probably want to accept. This is not a statement meant to trivialize art or perception, but more to ask: what if it’s all just a phenomenon of “relative interestingness”? What if perceiving is noticing something interesting, and we remember it because it’s different enough from what was already remembered?

So my response to the aforementioned paper “Memory for artwork is predictable” was… “well, of course it is”.

My hypothesis:

Human perception and, thus, memory depend on a balance between order and chaos. For something to be perceptible it must differ from other things in the environment and differ from the apparatus qualities of the perceiving system.

Camouflage in nature is a good example of the perceptual issue of difference or sameness between a thing and its environment.

An example of the apparatus qualities of a perceiving system is human vision. Human vision has a finite response rate, practically speaking “30 to 60 frames per second”. So if a video flashes by at 300 fps, most of the frames won’t be actively perceived.

I’ve elsewhere explored how order and chaos commingle all the time to bounce through perceptibility and even existence itself.

Interestingness and perceptibility are contingent on the situation.

Consider the following demonstration of randomness interacting with randomness and how “order” and “entropy” dance around. You can see that structure emerges from randomness with only simple logical operations. This, to me, is the critical point of perception and memory. Our perception, our environments and our universe are full of systems interacting and then bursting into more order (things we already understand) or into less order (things we do not yet know/understand).
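To make “order” and “entropy” a little more concrete, here is a minimal sketch of my own, not the demonstration referred to above. It assumes two independent streams of random bits, combines them with AND, OR and XOR, and measures the Shannon entropy of each result from the frequency of 1s. The helper name `bit_entropy` and the seed are hypothetical choices for illustration.

```python
import math
import random


def bit_entropy(bits):
    """Shannon entropy (bits per symbol) of a 0/1 sequence, from the frequency of 1s."""
    p = sum(bits) / len(bits)
    if p in (0.0, 1.0):
        return 0.0  # a constant sequence carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
n = 10_000
a = [rng.randint(0, 1) for _ in range(n)]
b = [rng.randint(0, 1) for _ in range(n)]

# Combine the two random streams with simple logical operations.
combined = {
    "AND": [x & y for x, y in zip(a, b)],
    "OR": [x | y for x, y in zip(a, b)],
    "XOR": [x ^ y for x, y in zip(a, b)],
}

entropies = {name: bit_entropy(bits) for name, bits in combined.items()}
for name, h in entropies.items():
    print(f"{name}: {h:.3f} bits per symbol")
```

AND and OR skew the mix of 0s and 1s, so their entropy should drop below one bit per symbol, while XOR should stay close to one bit. This measure only looks at frequencies, not spatial structure, so treat it as a rough gauge rather than a full account of “order”.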

This will warm you up perceptually for the exploration of my hypothesis and the code from the paper.

So onto my hypothesis of what’s going on in the paper with the claim that the memorability of artworks/visual signals is predictable…

I think it probably comes down to whether an image has a good balance of order and chaos, one that makes our perception wobble about and consider the image more, so that we can return to the image in our minds and in the world and still find more to look at.

Images that are simple regular patterns or flat planes do not do this. We can ignore them because we’ve seen them before; they are just like so many other images. Similarly, images that are mostly random or noise are ignorable. If there’s no discernible pattern then there’s nothing to notice, attach to, or remember.

Of course, images that provoke other associated responses in humans are also part of this equation. There are various shapes and color configurations that provide that extra bit of “pattern”. Molly Bang has probably the best exposition of this I’ve ever found. Really though, this is basically the same idea as before — order and chaos must be present to provoke and stoke and store — make us remember.

Thankfully images with various features are very easy to make with computers nowadays. I can thus test my hypothesis with the code from the paper rather easily. Of course you might find that the paper and its code leave a lot of questions, but I found the paper and its methods to be pretty sound, and they seem to corroborate a whole bunch of other gestalt theory, behavioral science, perception studies, information theory and even art theories. At the very least, it’s another great measurement tool.

My hypothesis is that memorability is basically a function of the balance between order and chaos, and that the deep learning network the researchers trained basically encodes that information. I thought to test it by looking at these few facets:

  • Images with a good balance of order and chaos in basic “compositional pattern” would be more memorable than simple patterns or mostly randomness.
  • Images that use a good balance of color order and chaos would be more memorable.
  • Compression of an image is a good proxy for memorability.

On With The Experimentation!

Notes:

When I refer to “network” below I am talking about the Resmem deep learning neural network from the paper that provides a measure of memorability.

My experimentation apparatus includes Wolfram Language 13.3 with LLMs, OpenAI Dalle, Elementary Cellular Automata and my own artistic knowledge taught to me by Shelley Corwin, Glenn Vilppu, and several other art teachers over the years.

Images with Balance between Order and Chaos

Thankfully Wolfram gives us a great library of images that have these features: the elementary cellular automata, when visualized as pixel grids, have a full range of compositional complexity.
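For readers without Wolfram Language, here is a rough Python stand-in for how an elementary cellular automaton turns a rule number into a pixel grid. It is a sketch under my own assumptions (cells beyond the edges count as 0, and the grid is drawn as text), and the names `eca_step`, `eca_grid` and `render` are hypothetical, not taken from the paper or from Wolfram.

```python
import random


def eca_step(cells, rule):
    """Apply one step of an elementary cellular automaton (cells outside the row count as 0)."""
    padded = [0] + cells + [0]
    # Each 3-cell neighborhood (left, center, right) forms a number 0..7
    # that selects one bit of the 8-bit rule number.
    return [
        (rule >> (padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1])) & 1
        for i in range(1, len(padded) - 1)
    ]


def eca_grid(rule, width=31, steps=15, random_init=False, seed=0):
    """Return the rows of the automaton, from the initial row downward."""
    if random_init:
        rng = random.Random(seed)
        row = [rng.randint(0, 1) for _ in range(width)]
    else:
        row = [0] * width
        row[width // 2] = 1  # single black cell in the middle
    grid = [row]
    for _ in range(steps):
        row = eca_step(row, rule)
        grid.append(row)
    return grid


def render(grid):
    """Draw the grid as text, one character per cell."""
    return "\n".join("".join("#" if c else "." for c in row) for row in grid)


for rule in (2, 30, 110, 254):
    print(f"Rule {rule}")
    print(render(eca_grid(rule)))
    print()
```

Passing `random_init=True` starts from a random first row instead of a single black cell, which is one way to add a little more randomness to any rule.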

Wolfram Chatbook makes this easy for us.

And sure enough… there seems to be a pretty obvious correlation between memorability and order and chaos. In the image below 4 different elementary CAs representing the 4 complexity classes are considered.

  • Rule 2 is a simple line. Not very memorable.
  • Rule 30 is a lovely triangle of randomly distributed little triangles. More memorable.
  • Rule 110 is a strange triangle with mostly regular little triangles. A little more memorable.
  • Oh, wait, Rule 254 is a giant black triangle… and is the most memorable of these.

Molly Bang must be at play here. That bold triangle must trigger something primal in us humans! There’s something interesting in that triangle.

If I add a little bit more randomness to each Rule considered above, the correlation starts to make more sense.

Rule 110 with a random initial condition becomes very memorable.

In the paper, images scoring around 0.9 or above are considered highly memorable!

So now it becomes very easy to generate memorable images:

Color as a Basic Feature of Memorability

  • There’s clearly a lot going on here with color.
  • I suspect color theory will come into play with various complementary palettes and various light/dark patterns.
  • An all-black image is quite forgettable, while an all-white one has information in the network.
  • Brighter hues seem to have more memorability.

What about compression as a proxy for memorability? Is that what the paper’s network is encoding?

When I compress some of the images of CAs I made there is some obvious correlation between compression and memorability. The more you can compress an image, the less memorable it is.

The ByteCount of the eCA images is ~2,500,000 uncompressed; compressed, they range from 3,700 to 70,000.

  • Rule 30, the 4th image above, doesn’t compress that well compared to the others, but isn’t the most memorable.
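The compression comparison can be approximated outside Wolfram Language too. The sketch below is my own assumption of a workable setup, not the measurement used above: it uses Python’s `zlib` in place of Wolfram’s compression, and three synthetic grayscale buffers (flat, striped, random noise) as stand-ins for real images. The function name `compression_ratio` is hypothetical.

```python
import random
import zlib


def compression_ratio(pixels: bytes) -> float:
    """Compressed size divided by raw size: near 0 is very compressible, near 1 barely."""
    return len(zlib.compress(pixels, 9)) / len(pixels)


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
size = 256 * 256  # one byte per pixel for a 256 x 256 grayscale image

samples = {
    # Every pixel is 0: pure order.
    "flat": bytes(size),
    # Alternating 8-pixel bands of black and white: a simple regular pattern.
    "stripes": bytes((i // 8 % 2) * 255 for i in range(size)),
    # Uniform random pixel values: pure chaos.
    "noise": bytes(rng.randrange(256) for _ in range(size)),
}

ratios = {name: compression_ratio(data) for name, data in samples.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

The flat and striped buffers should shrink to a tiny fraction of their raw size, while the noise should barely shrink at all. Under the order-and-chaos hypothesis, both extremes would be forgettable and the memorable images would sit somewhere in between, which is why compression alone is only a proxy.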

When I test the images from the paper itself I find the correlation of compression to memorability roughly holds. It’s a reliable feature. Makes sense, as compression is basically “what can be forgotten/ignored in an image and doesn’t distort it too much”.

The image of all the memorable images is about 50% compressible. The memorable images themselves only have about 30% compressibility.

One more exploration along these lines involves mostly random images and highly confusing camouflage images:

  • The random image is pretty forgettable, or tending that way.
  • The frog on the tree, by contrast, is pretty memorable.

Quite awesomely the network picks up the frog as the key feature! When I remove the background the network scores the same.

What about different subject matters?

It’s fairly easy with image generators to systematically prod the memorability network for features. Above is a study of figure count and colors.

If I remove color from the situation, what do we learn?

  • A single figure is more memorable?

What about me?

If I ask AI to use my online identities as prompts, do I get memorable images?

  • NOPE! hahahaha. These are the least memorable images of all I studied above. This is depressing. Maybe this analysis should be over?

Naw! This network gives us a very reliable lever to pull to make memorable images!

Then again… maybe it really is as simple as having just enough interesting stuff going on … but not too much!
