ChatGPT And MidJourney And Why Most Don't Need To Fear The Bot Revolution
The machine intelligence that will make us outmoded is not here – not yet, anyway.
We’ve crossed these technological thresholds before. And the outcomes were nothing like what was predicted, whether it was the creation of the typewriter, the availability of small and affordable motion picture cameras from Europe (viz. Arriflex and Bolex), affordable HD prosumer digital video cameras or digital musical instruments, or whether it was desktop software allowing uncoordinated people to create computer art.
Each time, pundits and purists declared the technology would make humankind the ‘servant’ and not the ‘master’. That it would dehumanize us and destroy the human touch, the human factor. That never happened. Not one time.
But these technological inflection points did have an impact, and they followed a pattern. They invited charlatans, changed the win distributions to favor the very best in a given field, and generally elevated everybody’s game.
The world got a shock recently when both ChatGPT and MidJourney demonstrated the effective use of generative models and natural language prompting (commands) to get bots to return an article, an image, or even a book that made sense and which, while not a Rembrandt or Hemingway, was kinda’ sorta’ okay. The images had people with six fingers, and the text had no specific insights – but in a pinch, it’s good for creating a photo of a phony boyfriend, or an 11th-grade paper on Vasco da Gama.
These technologies – MidJourney and ChatGPT – are not AI in the sense that spooks people: what technical circles call artificial general intelligence, meaning general-purpose intelligence or intelligence with agency, and the ability to contemplate and direct itself, inside a body or not.
You’re forgiven if you got spooked, but these tools are not learning as they go. They are machine learning models, which means they were trained in advance on large data sets (through successive feedback and correction), and what you get back is driven by that earlier training, not by anything the bot works out for itself. The data sets and the training are there from the get-go.
ChatGPT is built on a Generative Pre-trained Transformer (the “GPT” in its name), a model that finds patterns within sequences of text and uses them to predict what comes next.
Pretty cool stuff, even if it famously isn’t thinking, doesn’t know things, and therefore can’t get the number of fingers on a human hand correct. Yet.
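For the curious, the “find patterns within data sequences” idea can be sketched in a few lines of Python. This is a toy, not how ChatGPT is built: it simply counts which word tends to follow which, then guesses the most common follower. A real transformer is vastly larger and more sophisticated, and the function names and the sample sentence below are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count which word follows which in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, chosen only to keep the example tiny.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)

print(predict_next(model, "the"))  # prints "cat" (seen twice after "the")
print(predict_next(model, "dog"))  # prints "None" (never seen in training)
```

Note that the toy has no idea what a cat is. It only knows that “cat” followed “the” more often than anything else did, which is the point: pattern matching, not thinking.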
I won’t bury the lead here: text-to-image generation will likely do a few things:
- Raise the bar for the creative disciplines it touches – directly and indirectly. Currently, this includes digital images, graphics and photos – mostly on the web. Eventually, it will include print and even motion graphics and video. Everybody’s work – to remain competitive – will have to improve.
- Place the determination of who is the best, creatively, in the province of industry professionals, not the public, which will struggle to discern the best from just good.
- Shift the distribution of market wins (money, attention) away from just ‘average’ to the very best in any creative discipline.
- Create jobs for prompt engineers, who do with their minds what artists would do with their hands.
And ChatGPT will do even less than that. It will produce beautiful garbage. It will make you exactly the opposite of a thought leader, and thought leadership is where all the wins are in the Web 2.0 world, in case you have not been paying attention. It can synthesize and plagiarize with the best of them, but it will always be derivative, downstream. Because these technologies cannot think. They are only as useful as the hands (mind) using them are capable.
This work always has required, and will continue to require, some level of creativity and knowledge of art styles, genres and history – in addition to dexterity and physical coordination.
“The typewriter didn’t create more Ernest Hemingways.” ~ George Lucas
And the same thing happened with the digitization and programmability of music: MIDI, quantizing and, to some extent, AutoTune on vocals as well.
With ChatGPT, we are discussing a retrograde ‘scraper’ of content with no flair, no magic, no insight, no personality – none of the things you need to attract readers. It’s a search engine’s nightmare: just empty content nobody wants to read, that spiders have to sift through.
And with Midjourney, as with digital musical instruments and studios, we have simply placed the creative instruction further upstream, bypassing artistic ability (dexterity, the hands).
To those who believe machines, properly commanded by humans, can’t produce works of art, I encourage you to listen to Walking Wounded by Everything But The Girl, or Achtung Baby by U2. Every drop of humanity, impulse, and passion is still there. These were produced digital-first, much of it electronically programmed and tweaked by people who had the thought, the inspiration, the humanity, and who effectively used the technology as a tool, a servant.
To those who still don’t believe that this isn’t The End for creatives, and who can’t abide that nothing came of the 10 times we’ve seen this before in history, consider Top Gun: Maverick. The critics and fans alike gushed over the use of practical effects, which everybody could feel if not see. There’s still very much a place for the old ways. The new ways will succeed inasmuch as they are driven and mastered by the most human-centric artists and writers, and we still love the old ways. So, nothing was really changed by CGI, and nothing will change from text-to-image, or ChatGPT. Not really.
Old or new modalities: inasmuch as your job requires actual creativity and thought, and you are good at those, you’ll be fine.
These technologies are tools. They are no different than a light box, or a microphone, or a Roland synthesizer, or a Canon HD MiniDV camera, or Adobe After Effects, or Autodesk Maya, or…
Dexter Jettster: …Those analysis droids only focus on symbols. Huh! I should think that you Jedi would have more respect for the difference between knowledge and… heh heh heh… wisdom.
Obi-Wan: Well if droids could think, there’d be none of us here, would there?
~ Star Wars Episode II: Attack Of The Clones