Discover the new OpenAI Embeddings APIs

raphiki for Technology at Worldline

Dive into the latest advancements in AI with OpenAI's newly released Embeddings models. This guide offers a detailed exploration of their capabilities and integration using Node.js.


To begin, you'll need a paid account and an API key. Store the key in a .env file as shown:

GPT_KEY=sk-<the-rest-of-your-OpenAI-key>

Then, incorporate it into the execution context:

var dotenv = require("dotenv");
dotenv.config();

Embeddings

Embeddings are numerical representations of concepts as sequences of numbers (vectors), which let computers grasp the relationships between those concepts.

They facilitate tasks like clustering, search, or retrieval in machine learning models and algorithms, powering applications like similarity search or Retrieval Augmented Generation (RAG).
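To make "similarity" concrete: vectors pointing in similar directions represent related concepts, which is usually measured with cosine similarity. Here is a minimal sketch in plain JavaScript (the cosineSimilarity helper and the toy 2-dimensional vectors are our own illustration, not part of any API):

```javascript
// Cosine similarity: 1 means same direction (closely related concepts),
// 0 means orthogonal (unrelated), -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 2-dimensional "embeddings" for illustration
console.log(cosineSimilarity([1, 0], [0.9, 0.1])); // close to 1: related
console.log(cosineSimilarity([1, 0], [0, 1]));     // 0: unrelated
```

Real embeddings work the same way, just with far more dimensions.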

The current OpenAI Embedding models include:

  • text-embedding-ada-002: the older model
  • text-embedding-3-small: smaller and more efficient
  • text-embedding-3-large: larger and more robust

The first model was launched in December 2022 as the second generation of OpenAI embedding models.

The latter two, introduced in January 2024, represent the third generation. They are multilingual and boast improved performance and cost-efficiency.

Simple API Calls

Now, let's use and compare these three embedding models with straightforward API calls, vectorizing the same short sentence with each. We'll use the LangChain framework to keep the code simple.

text-embedding-ada-002

var { OpenAIEmbeddings } = require("langchain/embeddings/openai");

var embedV2 = new OpenAIEmbeddings({
  openAIApiKey: process.env.GPT_KEY,
  modelName: "text-embedding-ada-002"
});

async function getEmbedding(embedModel) {
  var res = await embedModel.embedQuery("The GenAI's Lamp makes your wishes come true!");
  console.log(res);
}

getEmbedding(embedV2);
[
   0.0027452074,   -0.017593594,  -0.0027131883,  -0.004108542,  -0.018618202,
     0.01779582,    -0.01828116,   -0.018550793,  -0.010603342,  -0.032140326,
    0.011425724,    0.011553801,  -0.0012276756, -0.0043411013,   0.014856813,
  -0.0052881893,    0.021799877,   -0.005918458,   0.007313812, -0.0053320047,
  -0.0073407753,    0.029551845,  -0.0033299753,  -0.022837967,  -0.015935346,
   -0.014020949,     0.01811938,   -0.022527888,   0.001973381,   0.020896606,
    0.016582469,  -0.0013363716,   -0.010684232,   -0.01580053, -0.0146141425,
    0.005365709,    0.005244374,   0.0154365245,   0.012261589, -0.0070104743,
     0.03329975, -0.00082912337, -0.00095467153,  0.0071318094,  -0.022999749,
  0.00013039313,  -0.0026980215,    0.011459429,  -0.009282137,  -0.003784982,
   0.0011189795,    0.009201247,   0.0031311205,  -0.029713625,  -0.018550793,
   -0.004266952,   -0.006033052,   -0.013299678,   0.012490777,  -0.015530896,
  -0.0035659047,   -0.017431814,    -0.01371761,  0.0035456822,  -0.002020567,
  -0.0057870117,   -0.001467818,    0.001700377, -0.0005839253,  -0.026127499,
    0.026815064,   -0.009039467,    0.008351902, -0.0017155439,   0.012079586,
   -0.009416955,   -0.003383902,    0.013110935,  -0.011722322,   0.005564564,
    0.013926577,   -0.010758381,   0.0044556954,  0.0018217121,   0.006039793,
    0.013845687,     -0.0044894,     0.01539608,   -0.03135839,  -0.025884828,
    0.002613761,    0.018105898,    0.024671476,  -0.006488059,  -0.002575001,
    0.009962962,    0.003953503,     0.01106172,    0.00212505,  -0.020222522,
  ... 1436 more items
]

This approach yields a 1536-dimensional vector.

text-embedding-3-small

var embedV3Small = new OpenAIEmbeddings({
  openAIApiKey: process.env.GPT_KEY,
  modelName: "text-embedding-3-small"
});

getEmbedding(embedV3Small);
[
    0.002214067,    -0.0390087,   0.018476259,   0.010861045,  -0.0148632545,
    -0.08788707,  -0.014547484,  0.0018661683,  -0.007196635,   -0.046058465,
   -0.042856697,   -0.01036903,   -0.04447227,  -0.056515615,    0.044149153,
   -0.011771639,   -0.03827435, -0.0089150155,   0.002790532,    0.014885285,
     0.01837345, -0.0025592116,  -0.003936119,  -0.043855414,   -0.014180309,
   0.0022526202,   0.033280768,   0.028683731,   0.034103237,   0.0045162556,
    0.009105947,  -0.026421933,   -0.00768865,  -0.060451735,   -0.028815914,
    0.054253817,   0.030372739,   0.012087409,  -0.052755743,  -0.0114118075,
  -0.0003013132,  -0.023102667,   0.019004991,   0.038832456,    0.008870955,
    0.035777558,  -0.027523458,   0.018123772,   0.011242907,     0.03704064,
   0.0027483068,   0.050082706,  -0.039831173,   0.022779554,    -0.06239042,
  -0.0043032942, -0.0072076507,     0.0390087, -0.0020084486,   -0.015685728,
   0.0077327113,   0.026157565, -0.0041821264,   0.011852417,   -0.030225867,
   -0.008290818,  -0.031958934,    0.05384258,  0.0028602954, -0.00078024744,
   -0.021619279,   0.053049482,  -0.042974193,  -0.052902613,    0.011749608,
  -0.0069836737,  -0.008826894,   0.042474836,   0.005610438,    0.015318552,
   -0.016302582,   -0.02092899,   -0.06773649,   -0.02780251,  -0.0008949897,
   -0.077018686,  -0.022485813,  -0.013732355,   -0.03886183,    0.008386283,
    0.005999644,  -0.019460289,  -0.064446606, -0.0129833175,   0.0063227583,
    0.050082706,   0.026642237,  -0.010075289, 0.00020527393,    0.062331673,
  ... 1436 more items
]

Here, the vector has the same dimensionality (1536), but the values differ: each model produces embeddings in its own vector space, so vectors from different models cannot be compared with one another.

text-embedding-3-large

var embedV3Large = new OpenAIEmbeddings({
  openAIApiKey: process.env.GPT_KEY,
  modelName: "text-embedding-3-large"
});

getEmbedding(embedV3Large);
[
   -0.004331901,  -0.04381693, -0.014906949,   -0.017649338,  -0.015994713,
    -0.00707429,  0.019074155, -0.011957733,     0.04482809,   0.019870825,
   -0.023976749,  0.016469652, -0.012333088,    0.001856666,   0.014753743,
    0.004757048, -0.017863827,  0.016546255,    0.029890502,  -0.031591088,
   -0.034226235,  -0.02886402,  0.032173272,   0.0036118329,    0.05542995,
   0.0050672903, -0.014317106, -0.008993195,   -0.027347282,   0.023088153,
  -0.0037248223,  0.006269958,  0.053652763,    -0.03266353,  -0.023639694,
  -0.0069670454,  0.033552125, 0.0072543067,   -0.033215072, -0.0063925227,
    -0.02373162,  0.015481472, -0.021050513,    -0.02100455,   -0.03128468,
  -0.0027921805,   0.02023852,  0.004531069,  -0.0059060934, -0.0011088288,
    0.018522613,  0.002074027,  -0.01795575,    0.018277483,   0.016224522,
    0.015075476,  0.030028388, -0.008173543,    0.027331961,   0.022322122,
   -0.015872147, 0.0068636313, -0.061833967,    0.025922464,  -0.026948946,
    -0.06143563,   0.02400739,   0.05037415,    0.030901661,   0.032203913,
   -0.016484972, 0.0050443094, -0.066859126,   -0.013865149,   0.028772097,
  -0.0070244977,  0.010785706, 0.0056456435,    0.019135436,  0.0020893477,
     0.03035012,  0.015205701,  0.050159663,   -0.012792706, -0.0034011744,
   0.0043357313, -0.016806705,  -0.04973069, -0.00027744658,  -0.039435238,
   -0.004473617,  0.044552322, -0.013229343,     -0.0237929,  0.0027174924,
    0.016607536, -0.013367228, 0.0004562668,   -0.015619358,   0.068881445,
  ... 2972 more items
]

This time, the vector expands to 3072 dimensions, double the size produced by the first two models.

Shortening Embeddings

The latest OpenAI models also support embedding shortening: you can truncate numbers from the end of the returned vectors with only a minimal loss of semantic value. Exposed through the dimensions parameter, this feature helps optimize the size and cost of local vector stores.
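Per OpenAI's documentation, shortening amounts to truncating the vector and re-normalizing it to unit length, which means you can also shorten full-size embeddings you have already stored. A sketch (the shortenEmbedding helper is our own, not an OpenAI or LangChain API):

```javascript
// Shorten an existing embedding locally: keep the first `dimensions`
// values, then re-normalize to unit length so cosine similarity
// remains meaningful on the truncated vector.
function shortenEmbedding(vector, dimensions) {
  const truncated = vector.slice(0, dimensions);
  const norm = Math.sqrt(truncated.reduce((sum, x) => sum + x * x, 0));
  return truncated.map((x) => x / norm);
}

// Toy vector for illustration: [3, 4, 12] truncated to 2 dimensions
console.log(shortenEmbedding([3, 4, 12], 2)); // [ 0.6, 0.8 ]
```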

Let's demonstrate this with direct API calls, as LangChainJS does not yet support the dimensions parameter.

fetch('https://api.openai.com/v1/embeddings', {
  method: "POST",
  headers: { 
    "Content-Type": "application/json",
    "Authorization": "Bearer " + process.env.GPT_KEY,
  },
  body: JSON.stringify({
    model: "text-embedding-3-small",
    input: "The GenAI's Lamp makes your wishes come true!",
    dimensions: 512
  })
})
  .then((response) => response.json())
  .then((json) => console.log(json.data[0].embedding));
[
    0.0032793144,  -0.057776842,   0.027365686,   0.016086586,  -0.022014368,
     -0.13017192,  -0.021546671,  0.0027640322,  -0.010659131,   -0.06821844,
     -0.06347621,   -0.01535785,   -0.06586908,   -0.08370681,   0.065390505,
    -0.017435294,  -0.056689173,   -0.01320427,  0.0041331323,   0.022046998,
     0.027213413, -0.0037905176,  -0.005829892,  -0.064955436,  -0.021002838,
    0.0033364168,    0.04929304,    0.04248425,    0.05051123,  0.0066891485,
     0.013487063,  -0.039134238,  -0.011387868,    -0.0895367,   -0.04268003,
       0.0803568,   0.044985883,   0.017902989,   -0.07813796,  -0.016902337,
  -0.00044628314,  -0.034217987,   0.028148808,     0.0575158,    0.01313901,
     0.052991107,  -0.040765736,   0.026843607,   0.016652172,   0.054861896,
    0.0040705916,    0.07417885,  -0.058995027,   0.033739414,   -0.09240814,
   -0.0063737254,  -0.010675446,   0.057776842, -0.0029747677,  -0.023232555,
    0.0114531275,   0.038742676, -0.0061942604,   0.017554937,   -0.04476835,
    -0.012279754,   -0.04733524,     0.0797477,  0.0042364607, -0.0011556456,
      -0.0320209,   0.078573026,  -0.063650236,   -0.07835549,   0.017402662,
    -0.010343708,  -0.013073751,   0.062910624,   0.008309771,   0.022688722,
    -0.024146196,  -0.030998493,   -0.10032635,   -0.04117905, -0.0013255934,
     -0.11407445,  -0.033304345,  -0.020339362,  -0.057559308,   0.012421151,
     0.008886235,   -0.02882316,  -0.095453605,  -0.019229943,   0.009364808,
      0.07417885,   0.039460536,  -0.014922784, 0.00030403674,    0.09232113,
  ... 412 more items
]
fetch('https://api.openai.com/v1/embeddings', {
  method: "POST",
  headers: { 
    "Content-Type": "application/json",
    "Authorization": "Bearer " + process.env.GPT_KEY,
  },
  body: JSON.stringify({
    model: "text-embedding-3-large",
    input: "The GenAI's Lamp makes your wishes come true!",
    dimensions: 1024
  })
})
  .then((response) => response.json())
  .then((json) => console.log(json.data[0].embedding));
[
  -0.0058769886,   -0.0594454, -0.020223908,   -0.023944441,   -0.02169965,
   -0.009597522,  0.025877457, -0.016222775,     0.06081722,   0.026958281,
    -0.03252869,  0.022343988,  -0.01673201,   0.0025188948,   0.020016057,
   0.0064537753, -0.024235433,  0.022447914,     0.04055174,  -0.042858887,
   -0.046433926, -0.039159138,   0.04364872,   0.0049000885,    0.07520051,
   0.0068746735,  -0.01942368, -0.012200857,   -0.037101414,   0.031323154,
  -0.0050533786,  0.008506305,   0.07278944,   -0.044313844,   -0.03207142,
   -0.009452026,   0.04551938,  0.009841748,    -0.04506211,  -0.008672585,
   -0.032196127,  0.021003349, -0.028558735,   -0.028496379,  -0.042443186,
  -0.0037880854,  0.027457124,  0.006147195,   -0.008012658,  -0.001504322,
    0.025129192, 0.0028137837, -0.024360143,    0.024796631,   0.022011427,
    0.020452544,  0.040738806, -0.011088854,    0.037080627,   0.030283898,
    -0.02153337,  0.009311727,  -0.08388869,      0.0351684,     -0.036561,
   -0.083348274,   0.03257026,   0.06834143,    0.041923556,    0.04369029,
   -0.022364773,  0.006843496,   -0.0907062,    -0.01881052,   0.039034426,
   -0.009529971,  0.014632714, 0.0076593114,    0.025960596,  0.0028345687,
    0.041175295,  0.020629218,   0.06805044,   -0.017355563, -0.0046142936,
    0.005882185, -0.022801261,  -0.06746845, -0.00037640528,   -0.05350086,
   -0.006069251,  0.060443085, -0.017947938,   -0.032279268,   0.003686758,
    0.022531055, -0.018135004, 0.0006190064,   -0.021190414,    0.09344983,
  ... 924 more items
]

Comparing RAG Usage

Up to now, we've focused on obtaining embedding vectors; it's time to put them to use. Let's apply the RAG pattern to a single PDF file and compare the outcomes.

RAG Pattern

Before invoking a Completion API, we first retrieve relevant information from local sources (like documents) based on the user's prompt.
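The retrieval step boils down to ranking stored document chunks by similarity to the query embedding. A toy sketch of what a vector store's retriever does internally (the retrieve helper and the hard-coded 2-dimensional vectors are illustrative, not LangChain's implementation):

```javascript
// Cosine similarity between two vectors of equal length
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank documents by similarity to the query vector and keep the top k;
// the selected chunks are then passed to the chat model as context.
function retrieve(queryVector, docs, k) {
  return docs
    .map((doc) => ({ text: doc.text, score: cosine(queryVector, doc.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy corpus with made-up 2-dimensional embeddings
const docs = [
  { text: "Genies grant wishes from lamps.", vector: [0.9, 0.1] },
  { text: "The weather is sunny today.", vector: [0.1, 0.9] },
];
console.log(retrieve([1, 0], docs, 1)[0].text); // the lamp sentence
```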

Here’s how we can apply this technique with a single PDF document.

var { PDFLoader } = require("langchain/document_loaders/fs/pdf");
var { MemoryVectorStore } = require("langchain/vectorstores/memory");
var { ChatOpenAI } = require("langchain/chat_models/openai");
var { RetrievalQAChain } = require("langchain/chains");

// Access to ChatCompletion API
var chat = new ChatOpenAI({
  openAIApiKey: process.env.GPT_KEY,
  modelName: "gpt-4"
});

var question = "Summarize the content of the document";

async function getVectorStores() {
  //Load our local PDF document
  var loader = new PDFLoader("LLM Openness.pdf");
  var docs = await loader.load();

  //Create a vectorstore from text-embedding-ada-002 vectors
  var vectorStoreV2 = await MemoryVectorStore.fromDocuments(
    docs,
    embedV2
  );

  //Create a vectorstore from text-embedding-3-small vectors
  var vectorStoreV3Small = await MemoryVectorStore.fromDocuments(
    docs,
    embedV3Small
  );

  //Create a vectorstore from text-embedding-3-large vectors
  var vectorStoreV3Large = await MemoryVectorStore.fromDocuments(
    docs,
    embedV3Large
  );

  //RAG with text-embedding-ada-002 vectors
  var chainV2 = RetrievalQAChain.fromLLM(
    chat, 
    vectorStoreV2.asRetriever()
  );

  var resV2 = await chainV2.call({ query: question });
  console.log("\n\nWith text-embedding-ada-002:\n" + resV2.text);

  //RAG with text-embedding-3-small vectors
  var chainV3Small = RetrievalQAChain.fromLLM(
    chat, 
    vectorStoreV3Small.asRetriever()
  );

  var resV3Small = await chainV3Small.call({ query: question });
  console.log("\n\nWith text-embedding-3-small:\n" + resV3Small.text);

  //RAG with text-embedding-3-large vectors
  var chainV3Large = RetrievalQAChain.fromLLM(
    chat, 
    vectorStoreV3Large.asRetriever()
  );

  var resV3Large = await chainV3Large.call({ query: question });
  console.log("\n\nWith text-embedding-3-large:\n" + resV3Large.text);

}

getVectorStores();

With text-embedding-ada-002:

The document discusses the concept of openness in Generative AI, specifically focusing on Language Learning Models (LLMs). The authors present a framework to evaluate the level of openness of an LLM and its potential for use and reuse.

The document covers the evolution of LLMs and their key components, starting with the early Natural Language Processing (NLP) methods like Word2Vec and GloVe, and leading to revolutionary models like Google's BERT and OpenAI's GPT series. 

Openness in AI is framed using a score system: Score 1 signifies research only with no extra info; Score 2 represents restricted access; Score 3 implies openness with certain usage limitations; and Score 4 indicates total openness with zero restrictions.

The authors argue that true openness in AI does not just imply access to open source models or datasets, but also to the frameworks and ecosystems supporting these models. They stress that clarity about licensing terms is paramount for collaborative development.

The document makes a case for the "Linux moment" of AI, referring to a future where freedom of innovation and collaborative work, along with commercial enterprise, becomes the norm.

The authors also discuss LLMs such as LLaMA, acknowledging their contribution to the field despite their limited openness. They note a trend toward open AI, emphasizing collaboration, transparency, and democratization, and argue that the choices made now will shape the future of technology and digital society.

With text-embedding-3-small:

The document discusses the level of openness in Generative AI, specifically focusing on various AI models developed by OpenAI, Google, and Meta (formerly Facebook). 

OpenAI's GPT-3 and GPT-4 models were initially open source but subsequent versions are not as open. ChatGPT's components, like model weights, fine-tuning dataset, reward model, and data processing code, are not publicly released. 

Similarly, Google initially emphasized openness in AI, as seen in the open-source nature of Google’s BERT model, but moved towards proprietary models, such as Bard powered by PaLM 2, which is not open source. 

In contrast, Meta's strategy fluctuated between open-source and proprietary stances. RoBERTa, an open-source model, was well-received, while LLM was proprietary and led to criticism. Later, Meta introduced LLaMA, a model that is publicly accessible but has limitations or restrictions under its license.

Overall, OpenAI, Google, and Meta appear to have shifted from open-source research to a more proprietary, commercial approach to safeguard intellectual assets and mitigate the misuse of models.

With text-embedding-3-large:

The document titled "How Open is Generative AI?" by Luxin Zhang and Raphaël Semeteys explores the history, current landscape, and future of openness in the development of Large Language Models (LLMs). The authors draw parallels between the AI landscape and the shared history of open source software and the internet, both of which transitioned from academic research to business-centric phenomena. The challenges and progress in the realm of Generative AI are compared to the Linux movement in terms of openness and collective construction.

The document discusses the nuances of licenses, sharing, and reusing different components, such as foundational model, weights, datasets, and code. It proposes a framework to evaluate the level of openness of a LLM and the potential restrictions to its use and reuse. A range of scores from 1 to 4 is suggested to represent different levels of openness, from research papers only with no other information (score 1) to a model totally open without restriction (score 4).

The evolution of Generative AI is said to fluctuate between open collaboration and proprietary control. Centralization of AI computing power tends to support closed-source solutions, while democratization supports open-source alternatives. The authors emphasize the importance of shared knowledge, collective growth, and freedom to innovate within the open AI movement. The decisions made now about centralization versus democratization and closing or opening up will shape not only the future of technology but also our digital society.

The three embedding models produce responses that vary in depth and detail. The text-embedding-3-large vectors seem to yield longer and more detailed answers, although it's hard to draw definitive conclusions from a single document.

Conclusion

In summary, OpenAI's embedding models - the second-generation text-embedding-ada-002 and the newly released text-embedding-3-small and text-embedding-3-large - mark a significant advancement in machine learning and natural language processing. These models offer varied capabilities, from efficient, compact vector representations to more detailed, robust embeddings, catering to a diverse range of applications.

Our exploration through Node.js and the LangChain framework demonstrates the practicality and versatility of these models. Whether it's generating concise vectors with the ability to shorten embeddings or harnessing the power of the Retrieval Augmented Generation (RAG) pattern for in-depth information retrieval, these models open up new possibilities for developers and researchers alike.

What stands out is the ease of integration and the potential for optimization in terms of performance and cost-effectiveness. The ability to tailor the embeddings' dimensions offers a notable advantage, particularly for applications where storage and computational efficiency are crucial.

As we continue to delve into the capabilities of these models, it becomes evident that the choices made by developers today will significantly shape the future landscape of AI applications. The OpenAI Embeddings models, with their improved multilingual support and enhanced performance, are poised to play a pivotal role in the evolution of AI, pushing the boundaries of what's possible in the realms of language understanding and information processing.
