Comparing Three Image-Generation AIs (DALL-E 3 / Midjourney / Stable Diffusion)

😃 Main

Before we begin

I'm a complete amateur when it comes to AI. Think of me as someone who is just playing around with it because it looks kind of fun.

🎃 It's Halloween, so a witch (that was the plan) 🧙‍♀

This one came out looking more like a wizard.

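Coming up with a prompt from scratch is a hassle, so I first worked out the image by chatting with DALL-E, then had it tell me the prompt it used, and ran that prompt on each service to compare. Since it's Halloween, I went with a witch.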

Ultra high-resolution 8k image of a beautiful Japanese girl with a bright and high-tension smile. Her hair is styled in a semi-long fashion, brown with shades of green on the inner layers. She stands by the sea, donning a witch cosplay outfit. The image captures the ambiance of a bright morning sun, illuminating her in a warm glow. Her cheerful demeanor stands out, complementing the atmosphere created by the use of a 35mm lens and f/1.8 aperture. The image is rotated 90 degrees to the right.

Midjourney (Niji Journey)

Niji Journey describes itself as "a cutting-edge AI that draws your very own original anime illustrations," and its illustrations really do come out beautifully.

The outfit is sometimes a witch's and sometimes close to what the protagonist of The Apothecary Diaries (薬屋のひとりごと) wears.

I changed the aspect ratio to 16:9. The hat shows up in some results and not in others.

This is the one I went with.

DALL-E 3

With Midjourney I ended up with an illustration (one very close to photorealistic), so I pushed DALL-E toward an illustrated look to match.

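When I told it to turn the image into an illustration, it went all in on the illustrated look. From here I'll bring it closer to the Midjourney result. It looks like a dating-sim game.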

The eyes got smaller. It still looks like a dating-sim game.

Now it has the feel of FF and other recent games.

I changed the camera distance, the background, and various other things, and I think I got it reasonably close.

Stable Diffusion

For the model, I used https://civitai.com/models/83096/yayoimix

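No matter how many times I tried, the outfit drifted toward Japanese-style clothing, so I gave up for now. Depending on the model, the output can easily veer into NSFW territory, which is its own hassle. That said, if you get good at combining the model, the prompt, and the negative prompt, I think the range of expression is wide, but my impression is that the difficulty is high. It also needs a decently specced machine.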
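To make the "model + prompt + negative prompt" idea a bit more concrete, here is a rough Python sketch of how one run's settings could be kept together. Everything in it is an assumption for illustration: the `GenerationSettings` class, the field names in `as_request`, and the negative terms are all made up, and the result is not the request format of any particular Stable Diffusion UI or API. The prompt is a shortened version of the one above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GenerationSettings:
    """Hypothetical container for one Stable Diffusion run (not a real API)."""

    model: str
    prompt: str
    negative_terms: list[str] = field(default_factory=list)

    def negative_prompt(self) -> str:
        # Join the terms into one comma-separated negative prompt,
        # dropping blanks and case-insensitive duplicates, keeping order.
        seen = set()
        kept = []
        for term in self.negative_terms:
            cleaned = term.strip()
            key = cleaned.lower()
            if cleaned and key not in seen:
                seen.add(key)
                kept.append(cleaned)
        return ", ".join(kept)

    def as_request(self) -> dict:
        # Illustrative field names only; adapt them to whatever tool you use.
        return {
            "model": self.model,
            "prompt": self.prompt,
            "negative_prompt": self.negative_prompt(),
        }


settings = GenerationSettings(
    model="yayoimix",
    prompt="a beautiful Japanese girl in a witch cosplay outfit, standing by the sea",
    # Assumed terms aimed at the two problems described above
    # (NSFW output and Japanese-style clothing); not tested against the model.
    negative_terms=["nsfw", "kimono", "NSFW", " japanese clothes "],
)
print(settings.as_request())
```

Keeping the negative terms as a list makes it easy to add or drop one term per attempt and see what changes, which is most of what the tuning comes down to.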

One more thing

Using https://civitai.com/models/6250/dosmix would probably give somewhat better results, but I lost the battle against NSFW output ☹️

Final comparison

Results from running almost the same prompt on each service with no particular tuning

Midjourney

DALL-E 3

Stable Diffusion

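If you want strongly illustration-style output, such as game or anime art, Midjourney is the easier and more satisfying choice. If you want to tune things a fair bit and get a reasonably good result, DALL-E 3 is a good fit.

Stable Diffusion looks like a good fit if you want to tune every little detail or render with a model of your own choosing. It is also the one to pick if you don't want to worry about restrictions on NSFW or copyrighted material. Those are my rough impressions.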
