I've already spoken to you about some of the advances in Generative Adversarial Networks (GAN), and last week I showed you how you can get the most from ML and MI in BricsCAD. This week I'm going to introduce you to the company NVIDIA and some of the advancements they are making. If you're a gamer, you may have already heard of them for their graphics cards. If not, all you need to know is that they are the company that has brought AI design to the next level. Just look at the AI-generated images in the header!
What's new in AI design?
You've already seen some of the wonky churches and three-legged dogs generated by GAN. The images produced by NVIDIA are of a much higher caliber, thanks to their new approach. The technique separates source images into "styles" and then uses GAN to apply these styles to existing images (read the paper here). If you are familiar with Google's Deep Dream Generator, you'll understand the principle.
Examples of how AI can automatically place styles "on top" of images.
Prior to this, GAN has generally been something of a "black box", but this new technique allows the team to manipulate the images in fine detail. It applies these styles to a reference image to controllable and varying degrees. The styles are split into 3 categories: coarse styles, middle styles and fine styles. In photographs of humans, coarse styles cover pose, hair and face shape; middle styles cover facial features and eyes; and fine styles cover the color scheme.
Images of real people are combined to create photo-realistic, AI designed people. Left: reference image. Top: style source.
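To make the idea of swapping style categories concrete, here is a minimal sketch in Python. It is not NVIDIA's code: the layer count, the way layers are grouped into coarse, middle and fine, and the names STYLE_GROUPS and mix_styles are all assumptions made for illustration, and the "styles" are plain labels rather than real network values.

```python
# Illustrative only: styles are stand-in labels, not real network activations.
NUM_LAYERS = 18  # assumed number of style layers in the generator

# Hypothetical grouping of layers into the three style categories.
STYLE_GROUPS = {
    "coarse": range(0, 4),    # e.g. pose, hair, face shape
    "middle": range(4, 8),    # e.g. facial features, eyes
    "fine": range(8, 18),     # e.g. color scheme
}


def mix_styles(reference, source, groups):
    """Keep the reference's per-layer styles, except in the named groups,
    where the style source's layers are used instead."""
    mixed = list(reference)
    for name in groups:
        for layer in STYLE_GROUPS[name]:
            mixed[layer] = source[layer]
    return mixed


reference = ["ref"] * NUM_LAYERS
source = ["src"] * NUM_LAYERS

# Take only the coarse styles from the style source.
mixed = mix_styles(reference, source, ["coarse"])
print(mixed)
```

Passing ["fine"] instead would, in this toy version, keep the reference's pose and features and borrow only the color scheme layers.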
This technology isn't so much creating images from scratch as it is using AI to carefully and cleverly combine existing images. As such, you get far more realistic generated images, and this isn't restricted to human faces! In photographs of bedrooms, coarse styles cover the camera angle; middle styles cover the furniture; and fine styles cover colors and material details.
Bedrooms, far more realistic than those shown in the last article. Some of those beds even look comfy!
In photographs of cars, coarse styles cover the camera angle; middle styles cover the car shape; and fine styles cover the colors.
Cars generated using the same technique. Note how the yellow car even has its hood up with a surprisingly realistic-looking engine.
The team can also control the "noise" added at each style layer and the degree to which this affects the original image. It is possible to manipulate the inconsequential details of an image by manipulating noise on the high-level style. This affects details such as freckles and hair strands in people, headlamps and backgrounds on cars, and material details in bedrooms. The noise generally gives the image a more detailed and seemingly more realistic finish, whilst removing noise completely results in a more "painterly" feel.
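As a rough illustration of what a noise "dial" does, the sketch below adds scaled random values to a row of numbers standing in for part of an image. The add_noise function and its strength parameter are hypothetical names chosen for this example, and real generators apply noise inside the network rather than to finished pixels.

```python
import random


def add_noise(feature_row, strength, rng):
    """Add Gaussian noise scaled by strength to each value.
    A strength of 0 leaves the row unchanged (the smooth, "painterly" case)."""
    return [value + strength * rng.gauss(0.0, 1.0) for value in feature_row]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
row = [0.5, 0.5, 0.5, 0.5]  # a flat stand-in for a patch of skin or fabric

smooth = add_noise(row, 0.0, rng)    # no noise: stays flat
detailed = add_noise(row, 0.1, rng)  # some noise: small random variation

print(smooth)
print(detailed)
```

Turning strength up adds more of the small, random variation that reads as freckles or fabric texture; turning it down to zero keeps everything flat.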
High-resolution GAN
The advancements in image creation don't stop at style layers. The team are also developing ways of improving GAN image resolution. They do this by increasing the image size incrementally. At each incremental increase they apply the same "checking" technique used for other GAN-generated images. This technique produced 1024 x 1024 images that, although still far from perfect, have a much greater level of detail and, more interestingly, accuracy than has been seen before.
High-resolution GAN demonstrates superior quality, realism, and resolution, although a few of those cars look like they might have had a nasty accident.
Some of these bedroom designs look almost liveable, and some of these buildings look a lot more solid than the ones I wrote about in the last article!
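The incremental resizing behind these high-resolution images can be pictured as a simple schedule of image sizes. The sketch below assumes training starts at 4 x 4 and doubles at each step until it reaches 1024 x 1024; the starting size, the doubling and the resolution_schedule name are assumptions for illustration, not details taken from this article.

```python
def resolution_schedule(start=4, final=1024):
    """List the square image sizes visited when doubling from start up to final."""
    sizes = []
    size = start
    while size <= final:
        sizes.append(size)
        size *= 2
    return sizes


schedule = resolution_schedule()
for size in schedule:
    # In a real system, the generator and its "checker" would both be
    # trained at this size before moving on to the next one.
    print(f"train at {size} x {size}")
```

Under these assumptions the network passes through nine sizes in total, learning the broad layout at small sizes before adding finer detail.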
The takeaway?
What this means is that in the future, it may be possible to have completely computer-generated celebrities that endorse products and film action scenes, all without needing hair and make-up, body doubles, lunch breaks, seven-figure salaries or "diva moments"!
But don't worry, developments in 3D GAN are still a long way off!