Today, we’ll review a very exciting development in artificial intelligence. In recent years, AI has largely been met with a lukewarm response from the average consumer, who struggles to see its relevance to their own life or occupation; their experience to date may consist of little more than Netflix recommendations or targeted advertisements. However, OpenAI’s DALL-E 2 launch (following the original DALL-E launch one year ago) is both an incredible technological feat and something the average consumer can appreciate. We explore this technology further in today’s newsletter.
Who is OpenAI?
OpenAI is an AI research and deployment company based in San Francisco.
OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity.
We (OpenAI) will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.
OpenAI Inc. is a non-profit that also operates a “capped-profit” arm, OpenAI LP, with Sam Altman as its CEO.
What is GPT-3?
GPT-3, or the third-generation Generative Pre-trained Transformer, is a machine learning model trained on internet data to generate any type of text. Developed by OpenAI, it requires only a small amount of input text to generate large volumes of relevant and sophisticated machine-generated text.
GPT-3's deep learning neural network has over 175 billion machine learning parameters. Prior to GPT-3, the largest trained language model was Microsoft's Turing NLG model, which had 17 billion parameters. At its release, GPT-3 was the largest neural network ever produced, and it is vastly better than any prior model at producing text – convincing enough that audiences could believe a human had written it.
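The “small amount of input text” above is typically a few-shot prompt: a handful of worked examples followed by a new query, which the model then completes in the same pattern. A minimal sketch of how such a prompt might be assembled (the sentiment task, examples, and helper function are illustrative, not part of OpenAI’s tooling):

```python
# Illustrative few-shot prompt builder; the task and examples are hypothetical.
def build_few_shot_prompt(examples, query):
    """Format (input, label) example pairs plus a new query into a single
    prompt string that a model like GPT-3 can complete."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The final entry omits the label, inviting the model to fill it in.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [
    ("A delightful, moving film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
prompt = build_few_shot_prompt(examples, "An instant classic.")
print(prompt)
```

The model’s continuation of the trailing “Sentiment:” line is the answer; no task-specific training is involved, which is the point of the “few-shot” framing.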
Introducing DALL-E
The name DALL-E is a portmanteau of the artist Salvador Dalí and Pixar’s WALL·E.
DALL·E is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions, using a dataset of text-image pairs. It has a diverse set of capabilities, including creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images. – OpenAI, https://openai.com/blog/dall-e/
In the grand scheme, a more robust understanding of how deep learning systems interpret natural language can help us use AI to augment human activities effectively in the future. DALL-E is making huge strides in that direction and has allowed AI researchers at OpenAI to pair NLP (natural language processing) with image-based output with great success. Trained on these text-image pairs, the model lets users create new images and edit existing ones using simple text prompts.
Let’s see what that looks like in practice:
DALL-E can interpret not only simple text prompts but also compound, multi-part instructions, with remarkably accurate results.
More Than Just a Fun Internet Tool
Though DALL-E’s results and output are fascinating and impressive, it’s important to dig beneath the surface to see the broader impact this will have on the technology community and, ultimately, the broader population. We can summarize those points as follows:
Computer Interface: The project from OpenAI represents a much broader opportunity: the ability for humans to interface effectively with computer applications using only natural language. This capability spans any type of application or service we use today.
Relationship Understanding: As displayed most effectively in the final tweet image from Flexport CEO Ryan Petersen, DALL-E can accurately depict several different kinds of relationships between objects in a scene. This points to clear use cases and viable applications for multimodal models in artificial intelligence.
AI Predictions Are Difficult: Years ago, the general consensus was that AI would first replace several forms of physical labor and then creative work much later on. As we can see in today’s example, AI can both augment human tasks and complete several tasks autonomously without human intervention.
OpenAI is currently taking waitlist sign-ups for its DALL-E 2 technology, with Sam Altman stating that the company expects to release the product to users this summer. At that point, users will be able to apply the technology to any application they see fit (within OpenAI’s guidelines). For now, OpenAI’s technology can be accessed via API, though each project must be approved beforehand. Waitlist: https://labs.openai.com/waitlist
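For readers curious what API access looks like in practice, here is a minimal sketch of a text-completion request using the `openai` Python package as it existed at the time of writing (the model name, prompt, and parameter values are illustrative assumptions, not a recommended configuration). The request is only sent if an API key is present in the environment:

```python
import os

# Parameters for a text-completion request; the values are illustrative.
request = {
    "engine": "davinci",  # model name assumed from the 0.x-era API
    "prompt": "Write a one-sentence summary of DALL-E 2:",
    "max_tokens": 64,
    "temperature": 0.7,
}

# Only call the API if a key is configured; otherwise just inspect the payload.
api_key = os.environ.get("OPENAI_API_KEY")
if api_key:
    import openai  # pip install openai (0.x-era interface assumed)
    openai.api_key = api_key
    response = openai.Completion.create(**request)
    print(response["choices"][0]["text"])
else:
    print(request["prompt"])
```

The approval step mentioned above happens outside the code: a project must be accepted by OpenAI before a key with access is issued.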
In aggregate, we are incredibly excited about the future of artificial intelligence and have thought about the labor-based benefits (and implications) at length. However, this recent display of creative work from an AI-based model has caused us to rethink both the timeline and magnitude of the impact that may occur in the next few years.
References:
“DALL·E: Creating Images from Text.” https://openai.com/blog/dall-e/. Accessed April 12, 2022.
“DALL·E 2.” https://openai.com/dall-e-2/. Accessed April 12, 2022.
“[2005.14165] Language Models are Few-Shot Learners.” https://arxiv.org/abs/2005.14165. Accessed April 12, 2022.
“Top 10 Images Generated by DALL-E 2.” https://www.linkedin.com/pulse/top-10-images-generated-dall-e-2-kd-deshpande/. Accessed April 12, 2022.
“Sam Altman.” https://blog.samaltman.com/. Accessed April 12, 2022.
This letter is not an offer to sell securities of any investment fund or a solicitation of offers to buy any such securities. An investment in any strategy, including the strategy described herein, involves a high degree of risk. Past performance of these strategies is not necessarily indicative of future results. There is the possibility of loss and all investment involves risk including the loss of principal.
Any projections, forecasts and estimates contained in this document are necessarily speculative in nature and are based upon certain assumptions. In addition, matters they describe are subject to known (and unknown) risks, uncertainties and other unpredictable factors, many of which are beyond Drawing Capital’s control. No representations or warranties are made as to the accuracy of such forward-looking statements. It can be expected that some or all of such forward-looking assumptions will not materialize or will vary significantly from actual results. Drawing Capital has no obligation to update, modify or amend this letter or to otherwise notify a reader thereof in the event that any matter stated herein, or any opinion, projection, forecast or estimate set forth herein, changes or subsequently becomes inaccurate.
This letter may not be reproduced in whole or in part without the express consent of Drawing Capital Group, LLC (“Drawing Capital”). The information in this letter was prepared by Drawing Capital, is believed by Drawing Capital to be reliable, and has been obtained from sources believed to be reliable. Drawing Capital makes no representation as to the accuracy or completeness of such information. Opinions, estimates and projections in this letter constitute the current judgment of Drawing Capital and are subject to change without notice.