Ora-lytics

OCI Stored Video Analysis

Posted on July 7, 2025 Updated on May 3, 2025

OCI Video Analysis is a part of the OCI Vision service, designed to process stored videos and identify labels, objects, texts, and faces within each frame. It can analyze the entire video or every frame in the video using pre-trained or custom models. The feature provides information about the detected labels, objects, texts, and faces, and provides the time at which they’re detected. With a timeline bar, you can directly look for a label or object and navigate to the exact timestamp in the video where a particular label or object is found.

To use this feature, you’ll need to upload your video to an OCI Bucket. Here is an example of a video stored in a bucket called OCI-Vision-Video-Demos.

You might need to allow Pre-Authenticated Requests for this bucket. If you don’t do this, you will be prompted by the tool to allow this.

https://www.dropbox.com/scl/fi/m61dk7evmcgk5novq34uf/Screenshot-2025-05-03-at-17.38.59.png?rlkey=ufie8eni2zbr5jwyyddsa1thb&dl=0

Next, go to the OCI Vision page. You’ll find the Video Analysis tool at the bottom of the menu on the left-hand side of the page.

You can check out the demo videos, or load your own video from the Local File system of you computer, or use a file from your OCI Storage. If you select a video from the Local File system, the video will be loaded in the object storage before it is processed.

For this example, I’m going to use the video I uploaded earlier called Trinity-Student.mp4. Copy the link to this file from the file information in object storage.

On the Video Analysis page, select Object Storage and paste the link to the file into the URL field. Then click Analyze button. It is at this point that you might get asked to Generate a PAR URL. Do this and then proceed.

While the video is being parsed, it will appear on the webpage and will start playing. When the video has been Analyzed the details will be displayed below the video. The Analysis will consist of Label Detection, Object Detection, Text Dection and Face Detection.

By clicking on each of these, you’ll see what has been detected, and by clicking on each of these, you’ll be able to see where in the video they were detected. For example, where a Chair was detected.

You can also inspect the JSON file containing all the details of various objects detected in the video and the location in the videos they can be found.

This JSON response file is also saved to Object Storage in the same directory, or a sub-directory, where the video is located.

This entry was posted in OCI, OCI Vision and tagged OCI, OCI AI Services, OCI Vision.

Hybrid Vector Search Index – An example

Posted on June 25, 2025

Over the past couple of years, we have seen Vector Search and Vector Indexes being added to Databases. These allow us to perform similarity searches on text, images and other types of object, with text being the typical examples demonstrated. One thing that seems to get lost in all the examples, is that the ability to perform text search in Databases has been around for a long, long time. Most databases come with various functions, search features, similarity search and indexes to find the text you are looking for. Here are links to two examples DBMS_SEARCH and Explicit Semantic Analysis [also check out Oracle Text]. But what if, instead of having multiple different features, which for each need setting up and configuring etc, you could combine Vector and Text Search into one Index. This is what Hybrid Vector Search Index does. Let’s have a look at how to setup and use this type of Index.

A Hybrid Vector Search index combines an Oracle Text domain index with an AI Vector Search index, which can be either an HSNW or IVF vector index. One key advantage of using a hybrid vector index is that it handles both chunking and vector embedding automatically during the index creation process, while at the same time setups up Oracle Text text search features. These are combined into one index.

Chunking in Vector Indexes is a technique used to break down large bodies of text into smaller, more manageable segments. This serves two main purposes: first, it improves the relevance of the content being vectorized, since lengthy text can contain multiple unrelated ideas that may reduce the accuracy of similarity searches. Second, it accommodates the input size limitations of embedding models. There’s no universal rule for determining the best chunk size—it often requires some trial and error to find what works best for your specific dataset.

Think of a “hybrid search” as a search engine that uses two different methods at once to find what you’re looking for.

Keyword Search: This is like a standard search, looking for the exact words you typed in.
Similarity Search: This is a smarter search that looks for things that are similar in meaning or concept, not just by the words used.

A hybrid search runs both types of searches and then mixes the results together into one final list. It does this by giving each result a “keyword score” (for the exact match) and a “semantic score” (for the conceptual match) to figure out the best ranking. You also have the flexibility to just do one or the other—only a keyword search or only a similarity search—and you can even adjust how the system combines the scores to get the results you want.

The example below will use a Wine Reviews (130K) dataset. This is available on Kaggle and other websites.This data set contain descriptive information of the wine with attributes about each wine including country, region, number of points, price, etc as well as a text description contain a review of the wine. The following are 2 files containing the DDL (to create the table) and then Import the data set (using sql script with insert statements). These can be run in your schema (in order listed below).

Insert records into WINEREVIEWS_130K_IMP table

Create table WINEREVIEWS_130K_IMP

I’ve also loaded the all-MiniLM-L12-v2 ONNX model into my schema. We’ll use this model for the Vector Indexing. The DESCRIPTION column contains the wine reviews details. It is this column we want to index.

To create a basic hybrid-vector index with the detailed settings we can run.

CREATE HYBRID VECTOR INDEX wine_reviews_hybrid_idx
ON WineReviews130K(description)
PARAMETERS ('MODEL all_minilm_l12_v2');

You can alter the parameters for this index. This can be done by defining your preferences. Here is an example where all the preferences are the defaults used in the above CREATE. Alter these to suit your situation and text.

BEGIN
  DBMS_VECTOR_CHAIN.CREATE_PREFERENCE(
    'my_vector_settings',
     dbms_vector_chain.vectorizer,
        json('{
            "vector_idxtype":  "ivf",
            "distance"      :  "cosine",
            "accuracy"      :  95,
            "model"         :  "minilm_l12_v2",
            "by"            :  "words",
            "max"           :  100,
            "overlap"       :  0,
            "split"         :  "recursively"          }'
        ));
END;

CREATE HYBRID VECTOR INDEX wine_reviews_hybrid_idx 
ON WineReviews130K(description)
PARAMETERS('VECTORIZER my_vector_settings');

After creating the hybrid-vector index you can explore some of the characteristics of the index and how the text was chunked, etc, using the dictionary view which was created for the index, for example.

desc wine_reviews_hybrid_idx$vectors

Name                          Null?    Type
----------------------------- -------- --------------------
DOC_ROWID                     NOT NULL ROWID
DOC_CHUNK_ID                  NOT NULL NUMBER
DOC_CHUNK_COUNT               NOT NULL NUMBER
DOC_CHUNK_OFFSET              NOT NULL NUMBER
DOC_CHUNK_LENGTH              NOT NULL NUMBER
DOC_CHUNK_TEXT                         VARCHAR2(4000)
DOC_EMBEDDING                          VECTOR(*, *, DENSE)

You are now ready to start querying your data and using the hybrid-vector index. The syntax for these queries can be a little complex to start with, but with a little practice, you’ll get the hang of it. Check out the documentation for more details and examples. You can perform the following types of searches:

Semantic Document
Semantic Chunk Mode
Keyword on Document
Keyword and Semantic on Document
Keyword and Semantic on Chunk

Here’s an example of using ‘Keyword and Semantic on Document’.

select json_Serialize(
  DBMS_HYBRID_VECTOR.SEARCH(
    json(
      '{
         "hybrid_index_name" : "wine_reviews_hybrid_idx",
         "search_scorer"     : "rsf",
         "search_fusion"     : "UNION",
         "vector":
          {
             "search_text"   : "What wines are similar to Tempranillo",
             "search_mode"   : "DOCUMENT",
             "aggregator"    : "MAX",
             "score_weight"  : 1,
             "rank_penalty"  : 5
          },
         "text":
          {
             "contains"      : "$tempranillo"
             "score_weight"  : 10,
             "rank_penalty"  : 1
          },
         "return":
          {
             "values"        : [ "rowid", "score", "vector_score", "text_score" ],
             "topN"          : 10
          }
      }'
    )
  ) RETURNING CLOB pretty);

This search will return the top matching records. The ROWID is part of the returned results, allowing you to look up the corresponding records or rows from the table.

Check out the other types of searches to see what might work best for your data and search criteria.

This entry was posted in Vector Database, Vector Embeddings and tagged 23ai, AI, Artificial Intelligence, database, LLM, Oracle, Vector Embedding.

OCI Text to Speech example

Posted on June 3, 2025 Updated on April 30, 2025

In this post, I’ll walk through the steps to get a very simple example of Text-to-Speech working. This example builds upon my previous posts on OCI Language, OCI Speech and others, so make sure you check out those posts.

The first thing you need to be aware of, and to check, before you proceed, is whether the Text-to-Speech is available in your region. At the time of writing, this feature was only available in Phoenix, which is one of the cloud regions I have access to. There are plans to roll it out to other regions, but I’m not aware of the timeline for this. Although you might see Speech listed on your AI menu in OCI, that does not guarantee the Text-to-Speech feature is available. What it does mean is the text trans scribing feature is available.

So if Text-to-Speech is available in your region, the following will get you up and running.

The first thing you need to do is read in the Config file from the OS.

#initial setup, read Config file, create OCI Client
import oci
from oci.config import from_file
##########
from oci_ai_speech_realtime import RealtimeSpeechClient, RealtimeSpeechClientListener
from oci.ai_speech.models import RealtimeParameters
##########

CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', profile_name=CONFIG_PROFILE)
###
ai_speech_client = ai_speech_client = oci.ai_speech.AIServiceSpeechClient(config)
###
print(config)
### Update region to point to Phoenix
config.update({'region':'us-phoenix-1'})

A simple little test to see if the Text-to-Speech feature is enabled for your region is to display the available list of voices.

list_voices_response = ai_speech_client.list_voices(
    compartment_id=COMPARTMENT_ID,
    display_name="Text-to-Speech")
#    opc_request_id="1GD0CV5QIIS1RFPFIOLF<unique_ID>")

# Get the data from response
print(list_voices_response.data)

This produces a long json object with many characteristics of the available voices. A simpler listing gives the names and gender)

for i in range(len(list_voices_response.data.items)):
    print(list_voices_response.data.items[i].display_name + ' [' + list_voices_response.data.items[i].gender + ']\t' + list_voices_response.data.items[i].language_description )
------
Brian [MALE]	English (United States)
Annabelle [FEMALE]	English (United States)
Bob [MALE]	English (United States)
Stacy [FEMALE]	English (United States)
Phil [MALE]	English (United States)
Cindy [FEMALE]	English (United States)
Brad [MALE]	English (United States)
Richard [MALE]	English (United States)

Now lets setup a Text-to-Speech example using the simple text, Hello. My name is Brendan and this is an example of using Oracle OCI Speech service. First lets define a function to save the audio to a file.

def save_audi_response(data):
    with open(filename, 'wb') as f:
        for b in data.iter_content():
            f.write(b)
    f.close()

We can now establish a connection, define the text, call the OCI Speech function to create the audio, and then to save the audio file.

import IPython.display as ipd

# Initialize service client with default config file
ai_speech_client = oci.ai_speech.AIServiceSpeechClient(config)

TEXT_DEMO = "Hello. My name is Brendan and this is an example of using Oracle OCI Speech service"

#speech_response = ai_speech_client.synthesize_speech(compartment_id=COMPARTMENT_ID)

speech_response = ai_speech_client.synthesize_speech(
    synthesize_speech_details=oci.ai_speech.models.SynthesizeSpeechDetails(
        text=TEXT_DEMO,
        is_stream_enabled=True,
        compartment_id=COMPARTMENT_ID,
        
        configuration=oci.ai_speech.models.TtsOracleConfiguration(
            model_family="ORACLE",
            model_details=oci.ai_speech.models.TtsOracleTts2NaturalModelDetails(
                model_name="TTS_2_NATURAL",
                voice_id="Annabelle"),
            speech_settings=oci.ai_speech.models.TtsOracleSpeechSettings(
                text_type="SSML",
                sample_rate_in_hz=18288,
                output_format="MP3",
                speech_mark_types=["WORD"])),
        
        audio_config=oci.ai_speech.models.TtsBaseAudioConfig(config_type="BASE_AUDIO_CONFIG") #, save_path='I'm not sure what this should be')
    ) )

# Get the data from response
#print(speech_response.data)
save_audi_response(speech_response.data)

This entry was posted in AI, OCI, OCI Speech and tagged AI, Artificial Intelligence, OCI, OCI AI Services, OCI Speech.

Translating Text using OCI AI Services

Posted on April 22, 2025 Updated on April 10, 2025

I’ve written several blog posts on using various features and functions of the OCI AI Services, and the most recent of these have been about some of the Language features. In this blog post, I’ll show you how to use the OCI Language Translation service.

As with the previous posts, there is some initial configuration and setup for your computer to access the OCI cloud services. Check out my previous posts on this. The following examples assume you have that configuration setup.

The OCI Translation service can translate text into over 30 different languages, with more being added over time.

There are 3 things needed to use the Translation Service. The Text to be translated, what language that text is in and what language you’d like the text translated into. Sounds simple. Well, it kind of is, but some care is needed to ensure it all works smoothly.

Let’s start with the basic setup of importing libraries, reading the config file and initialising the OCI AI Client.

import oci
from oci.config import from_file

#Read in config file - this is needed for connecting to the OCI AI Services
#COMPARTMENT_ID = "ocid1.tenancy.oc1..aaaaaaaaop3yssfqnytz5uhc353cmel22duc4xn2lnxdr4f4azmi2fqga4qa"
CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', profile_name=CONFIG_PROFILE)
###
ai_language_client = oci.ai_language.AIServiceLanguageClient(config)

Next, we can define what text we want to translate and what Language we want to translate the text into. In this case, I want to translate the text into French and to do so, we need to use the language abbreviation.

text_to_trans = "Hello. My name is Brendan and this is an example of using Oracle OCI Language translation service"
print(text_to_trans)
target_lang = "fr"

Next, we need to prepare the text and then send it to the translation service. Then, print the returned object

t_doc = oci.ai_language.models.TextDocument(key="Demo", text=text_to_trans, language_code="en")
trans_response = ai_language_client.batch_language_translation(oci.ai_language.models.BatchLanguageTranslationDetails(documents=[t_doc], target_language_code=target_lang))
print(trans_response.data)

The returned translated object is the following.

{
  "documents": [
    {
      "key": "Demo",
      "source_language_code": "en",
      "target_language_code": "fr",
      "translated_text": "Bonjour. Je m'appelle Brendan et voici un exemple d'utilisation du service de traduction Oracle OCI Language"
    }
  ],
  "errors": []
}

We can automate this process a little to automatically detect the input language. For example:

source_lang = ai_language_client.detect_dominant_language(oci.ai_language.models.DetectLanguageSentimentsDetails(text=text_to_trans))

t_doc = oci.ai_language.models.TextDocument(key="Demo", text=text_to_trans, language_code=source_lang.data.languages[0].code)
trans_response = ai_language_client.batch_language_translation(oci.ai_language.models.BatchLanguageTranslationDetails(documents=[t_doc], target_language_code=target_lang))
print(trans_response.data)

And we can also automate the translation into multiple different langues.

text_to_trans = "Hello. My name is Brendan and this is an example of using Oracle OCI Language translation service"
print(text_to_trans)
#target_lang = "fr"
target_lang = {"fr":"French", "nl":"Dutch", "de":"German", "it":"Italian", "ja":"Japaneese", "ko":"Korean", "pl":"polish"}

for lang_key, lang_name in target_lang.items():
    t_doc = oci.ai_language.models.TextDocument(key="Demo", text=text_to_trans, language_code="en")
    trans_response = ai_language_client.batch_language_translation(oci.ai_language.models.BatchLanguageTranslationDetails(documents=[t_doc], target_language_code=lang_key))
    ####
    print('    [' +  lang_name + '] ' + trans_response.data.documents[0].translated_text)

Hello. My name is Brendan and this is an example of using Oracle OCI Language translation service
    [French] Bonjour. Je m'appelle Brendan et voici un exemple d'utilisation du service de traduction Oracle OCI Language
    [Dutch] Hallo. Mijn naam is Brendan en dit is een voorbeeld van het gebruik van de Oracle OCI Language vertaalservice.
    [German] Hallo. Mein Name ist Brendan und dies ist ein Beispiel für die Verwendung des Oracle OCI Language-Übersetzungsservice
    [Italian] Ciao. Il mio nome è Brendan e questo è un esempio di utilizzo del servizio di traduzione di Oracle OCI Language
    [Japaneese] こんにちは。私の名前はBrendanで、これはOracle OCI Language翻訳サービスの使用例です
    [Korean] 안녕하세요. 내 이름은 브렌단이며 Oracle OCI 언어 번역 서비스를 사용하는 예입니다.
    [polish] Dzień dobry. Nazywam się Brendan i jest to przykład korzystania z usługi tłumaczeniowej OCI Language

This entry was posted in AI, OCI and tagged OCI AI Services, OCI Language, Translation.

How to access and use Multilingual embedding models using OML for Python 2.1

Posted on April 22, 2025

With the release of OML for Python 2.1, we have a newer approach for accessing and generating embedding models in ONNX format (from Hugging Face), which can then be loaded into the Database (>=23.7ai). Although with this newer approach, we need to be aware of the previous way of doing it (<=23.6). As of 23.7ai this previous approach is now deprecated. Given the short time the previous approach was around, things could yet change with subsequent releases.

As the popularity of Vector Search spreads, there have been some challenges relating to obtaining and using suitable models within a Database. Oracle 23ai introduced the ability to load an embedding model in ONNX format into the database. These could be easily used to create and update embedding vectors for your data. But there were some limitations and challenges with what ONNX models could be used, and sometimes this was limited to the formatting of the ONNX metadata and the pretrained models themselves.

To ease the process of locating, converting and loading an embedding model into an Oracle Database, Oracle Machine Learning for Python 2.1 has been released with some new features and methods to make the process slightly easier. There are three constraints for doing this: 1) you need to use Oracle Machine Learning for Python 2.1 (OML4Py 2.1), and 2) you need to use Oracle 23.7ai (or greater) Database (This later requirement is to allow for the expanded range of embedding models) and 3) Python 3.12 (or greater).

There are a few additional Python libraries needed, so check the documentation for more details on what the most recent version of these should be. But do be careful of the documentation as the version I followed there was some inconsistencies in different sections regarding the requirements and steps to follow for installation. Hopefully this will be tidied up by the time you read the documentation.

With OML4Py 2.1 there is a class called oml.utils.onnxpipeline, with a number of methods to support the exploration of preconfigured models. These include: from_preconfigured(), from_template() and show_preconfigured(). This last one displays the list of preconfigured models.

oml.ONNXPipelineConfig.show_preconfigured()
['sentence-transformers/all-mpnet-base-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/multi-qa-MiniLM-L6-cos-v1', 'sentence-transformers/distiluse-base-multilingual-cased-v2', 'sentence-transformers/all-MiniLM-L12-v2', 'BAAI/bge-small-en-v1.5', 'BAAI/bge-base-en-v1.5', 'taylorAI/bge-micro-v2', 'intfloat/e5-small-v2', 'intfloat/e5-base-v2', 'thenlper/gte-base', 'thenlper/gte-small', 'TaylorAI/gte-tiny', 'sentence-transformers/paraphrase-multilingual-mpnet-base-v2', 'intfloat/multilingual-e5-base', 'intfloat/multilingual-e5-small', 'sentence-transformers/stsb-xlm-r-multilingual', 'Snowflake/snowflake-arctic-embed-xs', 'Snowflake/snowflake-arctic-embed-s', 'Snowflake/snowflake-arctic-embed-m', 'mixedbread-ai/mxbai-embed-large-v1', 'openai/clip-vit-large-patch14', 'google/vit-base-patch16-224', 'microsoft/resnet-18', 'microsoft/resnet-50', 'WinKawaks/vit-tiny-patch16-224', 'Falconsai/nsfw_image_detection', 'WinKawaks/vit-small-patch16-224', 'nateraw/vit-age-classifier', 'rizvandwiki/gender-classification', 'AdamCodd/vit-base-nsfw-detector', 'trpakov/vit-face-expression', 'BAAI/bge-reranker-base']

We can inspect the full details of all models or for an individual model by changing some of the parameters.

oml.ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name='microsoft/resnet-50')
[{'checksum': 'fca7567354d0cb0b7258a44810150f7c1abf8e955cff9320b043c0554ba81f8e', 'model_type': 'IMAGE_CONVNEXT', 'pre_processors_img': [{'name': 'DecodeImage', 'do_convert_rgb': True}, {'name': 'Resize', 'enable': True, 'size': {'height': 384, 'width': 384}, 'resample': 'bilinear'}, {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, {'name': 'OrderChannels', 'order': 'CHW'}], 'post_processors_img': []}]

Let’s prepare a preconfigured model using the simplest approach. In this example we’ll use the 'sentence-transformers/multi-qa-MiniLM-L6-cos-v1' model. Start by defining a pipeline.

pipeline = oml.utils.ONNXPipeline(model_name="sentence-transformers/multi-qa-MiniLM-L6-cos-v1")

We now have the choice of loading it straight to your database using an already established OML connection.

pipeline.export2db("my_onnx_model_db")

or you can export the model to a file

pipeline.export2file("my_onnx_model", output_dir="/home/oracle/onnx")

When you check the file system, you’ll see something like the following. Depending on the model you exported and any configuration changes to the model, the file size will vary in size.

-rw-r--r--. 1 oracle oracle 90636512 Apr 22 12:00 my_onnx_model.onnx

Is you ran the export2file() function you’ll need to move that file to the database server. This will involve talking gently with the DBA to get their assistance. Once permission is obtained from the DBA, the ONNX file will be copied into a directory that is referenced from within the database. This is a simple task of defining an external file system directory where the file is located. In the example below the directory is referenced as ONNX_DIR. This is the reference defined in the database, which has a pointer to directory on the file system. The next task is to load the ONNX file into your Database schema.

BEGIN
   DBMS_VECTOR.LOAD_ONNX_MODEL(
        directory => 'ONNX_DIR',
        file_name => 'my_onnx_model.onnx',
        model_name => 'Multi-qa-MiniLM-L6-Cos-v1');
END;

Once loaded the model can be used to create Vector Embeddings.

This entry was posted in OML, OML4Py, Vector Embeddings and tagged OML, OML4Py, Vector Embeddings.

Unlock Text Analytics with Oracle OCI Python – Part 2

Posted on April 8, 2025 Updated on May 13, 2025

This is my second post on using Oracle OCI Language service to perform Text Analytics. These include Language Detection, Text Classification, Sentiment Analysis, Key Phrase Extraction, Named Entity Recognition, Private Data detection and masking, and Healthcare NLP.

In my Previous post (Part 1), I covered examples on Language Detection, Text Classification and Sentiment Analysis.

In this post (Part 2), I’ll cover:

Key Phrase
Named Entity Recognition
Detect private information and marking

Make sure you check out Part 1 for details on setting up the client and establishing a connection. These details are omitted in the examples below.

Key Phrase Extraction

With Key Phrase Extraction, it aims to identify the key works and/or phrases from the text. The keywords/phrases are selected based on what are the main topics in the text along with the confidence score. The text is parsed to extra the words/phrase that are important in the text. This can aid with identifying the key aspects of the document without having to read it. Care is needed as these words/phrases do not represent the meaning implied in the text.

Using some of the same texts used in Part-1, let’s see what gets generated for the text about a Hotel experience.

t_doc = oci.ai_language.models.TextDocument(
    key="Demo",
    text="This hotel is a bad place, I would strongly advise against going there. There was one helpful member of staff",
    language_code="en")

key_phrase = ai_language_client.batch_detect_language_key_phrases((oci.ai_language.models.BatchDetectLanguageKeyPhrasesDetails(documents=[t_doc])))

print(key_phrase.data)
print('==========')
for i in range(len(key_phrase.data.documents)):
        for j in range(len(key_phrase.data.documents[i].key_phrases)):
            print("phrase: ", key_phrase.data.documents[i].key_phrases[j].text +' [' + str(key_phrase.data.documents[i].key_phrases[j].score) + ']')

{
  "documents": [
    {
      "key": "Demo",
      "key_phrases": [
        {
          "score": 0.9998106383818767,
          "text": "bad place"
        },
        {
          "score": 0.9998106383818767,
          "text": "one helpful member"
        },
        {
          "score": 0.9944029848214838,
          "text": "staff"
        },
        {
          "score": 0.9849306609397931,
          "text": "hotel"
        }
      ],
      "language_code": "en"
    }
  ],
  "errors": []
}
==========
phrase:  bad place [0.9998106383818767]
phrase:  one helpful member [0.9998106383818767]
phrase:  staff [0.9944029848214838]
phrase:  hotel [0.9849306609397931]

The output from the Key Phrase Extraction is presented into two formats about. The first is the JSON object returned from the function, containing the phrases and their confidence score. The second (below the ==========) is a formatted version of the same JSON object but parsed to extract and present the data in a more compact manner.

The next piece of text to be examined is taken from an article on the F1 website about a change of divers.

text_f1 = "Red Bull decided to take swift action after Liam Lawsons difficult start to the 2025 campaign, demoting him to Racing Bulls and promoting Yuki Tsunoda to the senior team alongside reigning world champion Max Verstappen. F1 Correspondent Lawrence Barretto explains why… Sergio Perez had endured a painful campaign that saw him finish a distant eighth in the Drivers Championship for Red Bull last season – while team mate Verstappen won a fourth successive title – and after sticking by him all season, the team opted to end his deal early after Abu Dhabi finale."

t_doc = oci.ai_language.models.TextDocument(
    key="Demo",
    text=text_f1,
    language_code="en")

key_phrase = ai_language_client.batch_detect_language_key_phrases(oci.ai_language.models.BatchDetectLanguageKeyPhrasesDetails(documents=[t_doc]))
print(key_phrase.data)
print('==========')
for i in range(len(key_phrase.data.documents)):
        for j in range(len(key_phrase.data.documents[i].key_phrases)):
            print("phrase: ", key_phrase.data.documents[i].key_phrases[j].text +' [' + str(key_phrase.data.documents[i].key_phrases[j].score) + ']')

I won’t include all the output and the following shows the key phrases in the compact format

phrase:  red bull [0.9991468440416812]
phrase:  swift action [0.9991468440416812]
phrase:  liam lawsons difficult start [0.9991468440416812]
phrase:  2025 campaign [0.9991468440416812]
phrase:  racing bulls [0.9991468440416812]
phrase:  promoting yuki tsunoda [0.9991468440416812]
phrase:  senior team [0.9991468440416812]
phrase:  sergio perez [0.9991468440416812]
phrase:  painful campaign [0.9991468440416812]
phrase:  drivers championship [0.9991468440416812]
phrase:  red bull last season [0.9991468440416812]
phrase:  team mate verstappen [0.9991468440416812]
phrase:  fourth successive title [0.9991468440416812]
phrase:  all season [0.9991468440416812]
phrase:  abu dhabi finale [0.9991468440416812]
phrase:  team [0.9420016064526977]

While some aspects of this is interesting, care is needed to not overly rely upon it. It really depends on the usecase.

Named Entity Recognition

For Named Entity Recognition is a natural language process for finding particular types of entities listed as words or phrases in the text. The named entities are a defined list of items. For OCI Language there is a list available here. Some named entities have a sub entities. The return JSON object from the function has a format like the following.

{
  "documents": [
    {
      "entities": [
        {
          "length": 5,
          "offset": 5,
          "score": 0.969588577747345,
          "sub_type": "FACILITY",
          "text": "hotel",
          "type": "LOCATION"
        },
        {
          "length": 27,
          "offset": 82,
          "score": 0.897526216506958,
          "sub_type": null,
          "text": "one helpful member of staff",
          "type": "QUANTITY"
        }
      ],
      "key": "Demo",
      "language_code": "en"
    }
  ],
  "errors": []
}

For each named entity discovered the returned object will contain the Text identifed, the Entity Type, the Entity Subtype, Confidence Score, offset and length.

Using the text samples used previous, let’s see what gets produced. The first example is for the hotel review.

t_doc = oci.ai_language.models.TextDocument(
    key="Demo",
    text="This hotel is a bad place, I would strongly advise against going there. There was one helpful member of staff",
    language_code="en")

named_entities = ai_language_client.batch_detect_language_entities(
            batch_detect_language_entities_details=oci.ai_language.models.BatchDetectLanguageEntitiesDetails(documents=[t_doc]))

for i in range(len(named_entities.data.documents)):
        for j in range(len(named_entities.data.documents[i].entities)):
            print("Text: ", named_entities.data.documents[i].entities[j].text, ' [' + named_entities.data.documents[i].entities[j].type + ']'
                 + '[' + str(named_entities.data.documents[i].entities[j].sub_type) + ']' + '{offset:' 
                 + str(named_entities.data.documents[i].entities[j].offset) + '}')

Text:  hotel  [LOCATION][FACILITY]{offset:5}
Text:  one helpful member of staff  [QUANTITY][None]{offset:82}

The last two lines above are the formatted output of the JSON object. It contains two named entities. The first one is for the text “hotel” and it has a Entity Type of Location, and a Sub Entitity Type of Location. The second named entity is for a long piece of string and for this it has a Entity Type of Quantity, but has no Sub Entity Type.

Now let’s see what is creates for the F1 text. (the text has been given above and the code is very similar/same as above).

Text:  Red Bull  [ORGANIZATION][None]{offset:0}
Text:  swift  [ORGANIZATION][None]{offset:25}
Text:  Liam Lawsons  [PERSON][None]{offset:44}
Text:  2025  [DATETIME][DATE]{offset:80}
Text:  Yuki Tsunoda  [PERSON][None]{offset:138}
Text:  senior  [QUANTITY][AGE]{offset:158}
Text:  Max Verstappen  [PERSON][None]{offset:204}
Text:  F1  [ORGANIZATION][None]{offset:220}
Text:  Lawrence Barretto  [PERSON][None]{offset:237}
Text:  Sergio Perez  [PERSON][None]{offset:269}
Text:  campaign  [EVENT][None]{offset:304}
Text:  eighth in the  [QUANTITY][None]{offset:343}
Text:  Drivers Championship  [EVENT][None]{offset:357}
Text:  Red Bull  [ORGANIZATION][None]{offset:382}
Text:  Verstappen  [PERSON][None]{offset:421}
Text:  fourth successive title  [QUANTITY][None]{offset:438}
Text:  Abu Dhabi  [LOCATION][GPE]{offset:545}

Detect Private Information and Marking

The ability to perform data masking has been available in SQL for a long time. There are lots of scenarios where masking is needed and you are not using a Database or not at that particular time.

With Detect Private Information or Personal Identifiable Information the OCI AI function search for data that is personal and gives you options on how to present this back to the users. Examples of the types of data or Entity Types it will detect include Person, Adddress, Age, SSN, Passport, Phone Numbers, Bank Accounts, IP Address, Cookie details, Private and Public keys, various OCI related information, etc. The list goes on. Check out the documentation for more details on these. Unfortunately the documentation for how the Python API works is very limited.

The examples below illustrate some of the basic options. But there is lots more you can do with this feature like defining you own rules.

For these examples, I’m going to use the following text which I’ve assigned to a variable called text_demo.

Hi Martin. Thanks for taking my call on 1/04/2025. Here are the details you requested. My Bank Account Number is 1234-5678-9876-5432 and my Bank Branch is Main Street, Dublin. My Date of Birth is 29/02/1993 and I’ve been living at my current address for 15 years. Can you also update my email address to brendan.tierney@email.com. If toy have any problems with this you can contact me on +353-1-493-1111. Thanks for your help. Brendan.

m_mode = {"ALL":{"mode":'MASK'}} 

t_doc = oci.ai_language.models.TextDocument(key="Demo", text=text_demo,language_code="en")

pii_entities = ai_language_client.batch_detect_language_pii_entities(oci.ai_language.models.BatchDetectLanguagePiiEntitiesDetails(documents=[t_doc], masking=m_mode))

print(text_demo)
print('--------------------------------------------------------------------------------')
print(pii_entities.data.documents[0].masked_text)
print('--------------------------------------------------------------------------------')
for i in range(len(pii_entities.data.documents)):
        for j in range(len(pii_entities.data.documents[i].entities)):
            print("phrase: ", pii_entities.data.documents[i].entities[j].text +' [' + str(pii_entities.data.documents[i].entities[j].type) + ']')

Hi Martin. Thanks for taking my call on 1/04/2025. Here are the details you requested. My Bank Account Number is 1234-5678-9876-5432 and my Bank Branch is Main Street, Dublin. My Date of Birth is 29/02/1993 and I've been living at my current address for 15 years. Can you also update my email address to brendan.tierney@email.com. If toy have any problems with this you can contact me on +353-1-493-1111. Thanks for your help. Brendan.
--------------------------------------------------------------------------------
Hi ******. Thanks for taking my call on *********. Here are the details you requested. My Bank Account Number is ******************* and my Bank Branch is Main Street, Dublin. My Date of Birth is ********** and I've been living at my current address for ********. Can you also update my email address to *************************. If toy have any problems with this you can contact me on ***************. Thanks for your help. *******.
--------------------------------------------------------------------------------
phrase:  Martin [PERSON]
phrase:  1/04/2025 [DATE_TIME]
phrase:  1234-5678-9876-5432 [CREDIT_DEBIT_NUMBER]
phrase:  29/02/1993 [DATE_TIME]
phrase:  15 years [DATE_TIME]
phrase:  brendan.tierney@email.com [EMAIL]
phrase:  +353-1-493-1111 [TELEPHONE_NUMBER]
phrase:  Brendan [PERSON]

The above this the basic level of masking.

A second option is to use the REMOVE mask. For this, change the mask format to the following.

m_mode = {"ALL":{'mode':'REMOVE'}}

For this option the indentified information is removed from the text.

Hi . Thanks for taking my call on . Here are the details you requested. My Bank Account Number is  and my Bank Branch is Main Street, Dublin. My Date of Birth is  and I've been living at my current address for . Can you also update my email address to . If toy have any problems with this you can contact me on . Thanks for your help. .
--------------------------------------------------------------------------------
phrase:  Martin [PERSON]
phrase:  1/04/2025 [DATE_TIME]
phrase:  1234-5678-9876-5432 [CREDIT_DEBIT_NUMBER]
phrase:  29/02/1993 [DATE_TIME]
phrase:  15 years [DATE_TIME]
phrase:  brendan.tierney@email.com [EMAIL]
phrase:  +353-1-493-1111 [TELEPHONE_NUMBER]
phrase:  Brendan [PERSON]

For the REPLACE option we have.

m_mode = {"ALL":{'mode':'REPLACE'}}

Which gives us the following, where we can see the key information is removed and replace with the name of Entity Type.

Hi <PERSON>. Thanks for taking my call on <DATE_TIME>. Here are the details you requested. My Bank Account Number is <CREDIT_DEBIT_NUMBER> and my Bank Branch is Main Street, Dublin. My Date of Birth is <DATE_TIME> and I've been living at my current address for <DATE_TIME>. Can you also update my email address to <EMAIL>. If toy have any problems with this you can contact me on <TELEPHONE_NUMBER>. Thanks for your help. <PERSON>.
--------------------------------------------------------------------------------
phrase:  Martin [PERSON]
phrase:  1/04/2025 [DATE_TIME]
phrase:  1234-5678-9876-5432 [CREDIT_DEBIT_NUMBER]
phrase:  29/02/1993 [DATE_TIME]
phrase:  15 years [DATE_TIME]
phrase:  brendan.tierney@email.com [EMAIL]
phrase:  +353-1-493-1111 [TELEPHONE_NUMBER]
phrase:  Brendan [PERSON]

We can also change the character used for the masking. In this example we change the masking character to + symbol.

m_mode = {"ALL":{'mode':'MASK','maskingCharacter':'+'}}

Hi ++++++. Thanks for taking my call on +++++++++. Here are the details you requested. My Bank Account Number is +++++++++++++++++++ and my Bank Branch is Main Street, Dublin. My Date of Birth is ++++++++++ and I've been living at my current address for ++++++++. Can you also update my email address to +++++++++++++++++++++++++. If toy have any problems with this you can contact me on +++++++++++++++. Thanks for your help. +++++++.

I mentioned at the start of this section there was lots of options available to you, including defining your own rules, using regular expressions, etc. Let me know if you’re interested in exploring some of these and I can share a few more examples.

This entry was posted in AI, OCI, Python and tagged AI, OCI, OCI AI Services, OCI Language.

Where to find Blogs on Oracle Database

Posted on April 3, 2025

A regular question I get asked is, “is there a list of Oracle-related Blogs?” and “what people/blogs should I follow to learn more about Oracle Database?” This typically gets asked by people in the early stages of their careers and even by those who have been around for a while.

You could ask these questions to twenty different (experienced) people, and you’d get largely the same answers with some variations. These variations would be down to their preferences on how certain people cover certain topics. This comes down to experience of following lots of people and learning over time.

But where can people get started quickly with a list of 100 blogs. Feedspot is one such place where you can find such a place and subscribe for update emails when new blog posts are written.

Go check out the list. You might discover some new bloggers and content.

This entry was posted in Machine Learning.

Unlock Text Analytics with Oracle OCI Python – Part 1

Posted on April 1, 2025 Updated on March 28, 2025

Oracle OCI has a number of features that allows you to perform Text Analytics such as Language Detection, Text Classification, Sentiment Analysis, Key Phrase Extraction, Named Entity Recognition, Private Data detection and masking, and Healthcare NLP.

While some of these have particular (and in some instances limited) use cases, the following examples will illustrate some of the main features using the OCI Python library. Why am I using Python to illustrate these? This is because most developers are using Python to build applications.

In this post, the Python examples below will cover the following:

Language Detection
Text Classification
Sentiment Analysis

In my next post on this topic, I’ll cover:

Key Phrase
Named Entity Recognition
Detect private information and marking

Before you can use any of the OCI AI Services, you need to set up a config file on your computer. This will contain the details necessary to establish a secure connection to your OCI tendency. Check out this blog post about setting this up.

The following Python examples illustrate what is possible for each feature. In the first example, I include what is needed for the config file. This is not repeated in the examples that follow, but it is still needed.

Language Detection

Let’s begin with a simple example where we provide a simple piece of text and as OCI Language Service, using OCI Python, to detect the primary language for the text and display some basic information about this prediction.

import oci
from oci.config import from_file

#Read in config file - this is needed for connecting to the OCI AI Services
CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', profile_name=CONFIG_PROFILE)
###

ai_language_client = oci.ai_language.AIServiceLanguageClient(config)

# French : 
text_fr = "Bonjour et bienvenue dans l'analyse de texte à l'aide de ce service cloud"

response = ai_language_client.detect_dominant_language(
    oci.ai_language.models.DetectLanguageSentimentsDetails(
        text=text_fr
    )
)

print(response.data.languages[0].name)
----------
French

In this example, I’ve a simple piece of French (for any native French speakers, I do apologise). We can see the language was identified as French. Let’s have a closer look at what is returned by the OCI function.

print(response.data)
----------
{
  "languages": [
    {
      "code": "fr",
      "name": "French",
      "score": 1.0
    }
  ]
}

We can see from the above, the object contains the language code, the full name of the language and the score to indicate how strong or how confident the function is with the prediction. When the text contains two or more languages, the function will return the primary language used.

Note: OCI Language can detect at least 113 different languages. Check out the full list here.

Let’s give it a try with a few other languages, including Irish, which localised to certain parts of Ireland. Using the same code as above, I’ve included the same statement (google) translated into other languages. The code loops through each text statement and detects the language.

import oci
from oci.config import from_file

###
CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', profile_name=CONFIG_PROFILE)
###

ai_language_client = oci.ai_language.AIServiceLanguageClient(config)

# French : 
text_fr = "Bonjour et bienvenue dans l'analyse de texte à l'aide de ce service cloud"
# German:
text_ger = "Guten Tag und willkommen zur Textanalyse mit diesem Cloud-Dienst"
# Danish
text_dan = "Goddag, og velkommen til at analysere tekst ved hjælp af denne skytjeneste"
# Italian
text_it = "Buongiorno e benvenuti all'analisi del testo tramite questo servizio cloud"
# English:
text_eng = "Good day, and welcome to analysing text using this cloud service"
# Irish
text_irl = "Lá maith, agus fáilte romhat chuig anailís a dhéanamh ar théacs ag baint úsáide as an tseirbhís scamall seo"

for text in [text_eng, text_ger, text_dan, text_it, text_irl]:
    response = ai_language_client.detect_dominant_language(
        oci.ai_language.models.DetectLanguageSentimentsDetails(
            text=text
        )
    )
    print('[' + response.data.languages[0].name + ' ('+ str(response.data.languages[0].score) +')' + '] '+ text)

----------
[English (1.0)] Good day, and welcome to analysing text using this cloud service
[German (1.0)] Guten Tag und willkommen zur Textanalyse mit diesem Cloud-Dienst
[Danish (1.0)] Goddag, og velkommen til at analysere tekst ved hjælp af denne skytjeneste
[Italian (1.0)] Buongiorno e benvenuti all'analisi del testo tramite questo servizio cloud
[Irish (1.0)] Lá maith, agus fáilte romhat chuig anailís a dhéanamh ar théacs ag baint úsáide as an tseirbhís scamall seo

When you run this code yourself, you’ll notice how quick the response time is for each.

Text Classification

Now that we can perform some simple language detections, we can move on to some more insightful functions. The first of these is Text Classification. With Text Classification, it will analyse the text to identify categories and a confidence score of what is covered in the text. Let’s have a look at an example using the English version of the text used above. This time, we need to perform two steps. The first is to set up and prepare the document to be sent. The second step is to perform the classification.

### Text Classification
text_document = oci.ai_language.models.TextDocument(key="Demo", text=text_eng, language_code="en")
text_class_resp = ai_language_client.batch_detect_language_text_classification(
            batch_detect_language_text_classification_details=oci.ai_language.models.BatchDetectLanguageTextClassificationDetails(
                documents=[text_document]
            )
        )
print(text_class_resp.data)
----------
{
  "documents": [
    {
      "key": "Demo",
      "language_code": "en",
      "text_classification": [
        {
          "label": "Internet and Communications/Web Services",
          "score": 1.0
        }
      ]
    }
  ],
  "errors": []
}

We can see it has correctly identified the text is referring to or is about “Internet and Communications/Web Services”. For a second example, let’s use some text about F1. The following is taken from an article on F1 app and refers to the recent Driver issues, and we’ll use the first two paragraphs.

{
  "documents": [
    {
      "key": "Demo",
      "language_code": "en",
      "text_classification": [
        {
          "label": "Sports and Games/Motor Sports",
          "score": 1.0
        }
      ]
    }
  ],
  "errors": []
}

We can format this response object as follows.

print(text_class_resp.data.documents[0].text_classification[0].label 
      + ' [' + str(text_class_resp.data.documents[0].text_classification[0].score) + ']')
----------
Sports and Games/Motor Sports [1.0]

It is possible to get multiple classifications being returned. To handle this we need to use a couple of loops.

for i in range(len(text_class_resp.data.documents)):
        for j in range(len(text_class_resp.data.documents[i].text_classification)):
            print("Label: ", text_class_resp.data.documents[i].text_classification[j].label)
            print("Score: ", text_class_resp.data.documents[i].text_classification[j].score)
----------
Label:  Sports and Games/Motor Sports
Score:  1.0

Yet again, it correctly identified the type of topic area for the text. At this point, you are probably starting to get ideas about how this can be used and in what kinds of scenarios. This list will probably get longer over time.

Sentiement Analysis

For Sentiment Analysis we are looking to gauge the mood or tone of a text. For example, we might be looking to identify opinions, appraisals, emotions, attitudes towards a topic or person or an entity. The function returned an object containing a positive, neutral, mixed and positive sentiments and a confidence score. This feature currently supports English and Spanish.

The Sentiment Analysis function provides two way of analysing the text:

At a Sentence level
Looks are certain Aspects of the text. This identifies parts/words/phrase and determines the sentiment for each

Let’s start with the Sentence level Sentiment Analysis with a piece of text containing two sentences with both negative and positive sentiments.

#Sentiment analysis
text = "This hotel was in poor condition and I'd recommend not staying here. There was one helpful member of staff"

text_document = oci.ai_language.models.TextDocument(key="Demo", text=text, language_code="en")
text_doc=oci.ai_language.models.BatchDetectLanguageSentimentsDetails(documents=[text_document])

text_sentiment_resp = ai_language_client.batch_detect_language_sentiments(text_doc, level=["SENTENCE"])

print (text_sentiment_resp.data)

The response object gives us:

{
  "documents": [
    {
      "aspects": [],
      "document_scores": {
        "Mixed": 0.3458947,
        "Negative": 0.41229093,
        "Neutral": 0.0061426135,
        "Positive": 0.23567174
      },
      "document_sentiment": "Negative",
      "key": "Demo",
      "language_code": "en",
      "sentences": [
        {
          "length": 68,
          "offset": 0,
          "scores": {
            "Mixed": 0.17541811,
            "Negative": 0.82458186,
            "Neutral": 0.0,
            "Positive": 0.0
          },
          "sentiment": "Negative",
          "text": "This hotel was in poor condition and I'd recommend not staying here."
        },
        {
          "length": 37,
          "offset": 69,
          "scores": {
            "Mixed": 0.5163713,
            "Negative": 0.0,
            "Neutral": 0.012285227,
            "Positive": 0.4713435
          },
          "sentiment": "Mixed",
          "text": "There was one helpful member of staff"
        }
      ]
    }
  ],
  "errors": []
}

There are two parts to this object. The first part gives us the overall Sentiment for the text, along with the confidence scores for all possible sentiments. The second part of the object breaks the test into individual sentences and gives the Sentiment and confidence scores for the sentence. Overall, the text used in “Negative” with a confidence score of 0.41229093. When we look at the sentences, we can see the first sentence is “Negative” and the second sentence is “Mixed”.

When we switch to using Aspect we can see the difference in the response.

text_sentiment_resp = ai_language_client.batch_detect_language_sentiments(text_doc, level=["ASPECT"])

print (text_sentiment_resp.data)

The response object gives us:

{
  "documents": [
    {
      "aspects": [
        {
          "length": 5,
          "offset": 5,
          "scores": {
            "Mixed": 0.17299445074935532,
            "Negative": 0.8268503302365734,
            "Neutral": 0.0,
            "Positive": 0.0001552190140712097
          },
          "sentiment": "Negative",
          "text": "hotel"
        },
        {
          "length": 9,
          "offset": 23,
          "scores": {
            "Mixed": 0.0020200687053503,
            "Negative": 0.9971282906307877,
            "Neutral": 0.0,
            "Positive": 0.0008516406638620019
          },
          "sentiment": "Negative",
          "text": "condition"
        },
        {
          "length": 6,
          "offset": 91,
          "scores": {
            "Mixed": 0.0,
            "Negative": 0.002300517913679934,
            "Neutral": 0.023815747524769032,
            "Positive": 0.973883734561551
          },
          "sentiment": "Positive",
          "text": "member"
        },
        {
          "length": 5,
          "offset": 101,
          "scores": {
            "Mixed": 0.10319573538533408,
            "Negative": 0.2070680870320537,
            "Neutral": 0.0,
            "Positive": 0.6897361775826122
          },
          "sentiment": "Positive",
          "text": "staff"
        }
      ],
      "document_scores": {},
      "document_sentiment": "",
      "key": "Demo",
      "language_code": "en",
      "sentences": []
    }
  ],
  "errors": []
}

The different aspects are extracted, and the sentiment for each within the text is determined. What you need to look out for are the labels “text” and “sentiment.

This entry was posted in AI, OCI, Python and tagged AI, OCI, OCI AI Services, OCI Language.

python-oracledb driver version 3 – load data into pandas df

Posted on March 20, 2025

The Python Oracle driver had a new release recently (version 3) and with it comes a new way to load data from a Table into a Pandas dataframe. This can now be done using the pyarrow library. Here’s an example:

import oracledb ora
import pyarrow py
import pandas

#create a connection to the database
con = ora.connect( <enter your connection details> )

query = "select cust_id, cust_first_name, cust_last_name, cust_city from customers"

#get Oracle DF and set array size - care is needed for setting this
ora_df = con.fetch_df_all(statement=query, arraysize=2000)

#run query and return into Pandas Dataframe
#  using pyarrow and the to_pandas() function
df = py.Table.from_arrays(ora_df.column_arrays(), names=ora_df.columns()).to_pandas()

print(df.columns)

Once you get used to the syntax it is a simpler way to get the data into dataframe.

This entry was posted in Oracle, Python and tagged oracledb, Python.

BOCAS – using OCI GenAI Agent and Stremlit

Posted on March 10, 2025

BOCAS stands for Brendan’s Oracle Chatbot Agent for Shakespeare. I’ve previously posted on how to go about creating a GenAI Agent on a specific data set. In this post, I’ll share code on how I did this using Python Streamlit.

And here’s the code

import streamlit as st
import time
import oci
from oci import generative_ai_agent_runtime
import json


# Page Title
welcome_msg = "Welcome to BOCAS."
welcome_msg2 = "This is Brendan's Oracle Chatbot Agent for Skakespeare. Ask questions about the works of Shakespeare."
st.title(welcome_msg) 
 
# Sidebar Image
st.sidebar.header("BOCAS")
st.sidebar.image("bocas-3.jpg", use_column_width=True) 
#with st.sidebar:
#    with st.echo:
#        st.write(welcome_msg2)
st.sidebar.markdown(welcome_msg2)
st.sidebar.markdown("The above image above was generated by Copilot using the following prompt.  generate an image icon for a chatbot called BOCAS which means Brendan's Oracle Chat Agent for Shakespeare, add BOCAS to image, Add a modern twist to Shakespeare's elements")
 
st.sidebar.write("")
st.sidebar.write("")
st.sidebar.write("")
st.sidebar.image("https://media.shakespeare.org.uk/images/SBT_SR_OS_37_Shakespeare_Firs.ec42f390.fill-1200x600-c75.jpg")
link="This image is from the [Shakespeare Trust website](https://media.shakespeare.org.uk/images/SBT_SR_OS_37_Shakespeare_Firs.ec42f390.fill-1200x600-c75.jpg)"
st.sidebar.write(link,unsafe_allow_html=True)

# OCI GenAI settings
CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', CONFIG_PROFILE)
###
SERVICE_EP = <your service endpoint> 
AGENT_EP_ID = <your agent endpoint>
###
 
# Response Generator
def response_generator(text_input):
    #Initiate AI Agent runtime client
    genai_agent_runtime_client = generative_ai_agent_runtime.GenerativeAiAgentRuntimeClient(config, service_endpoint=SERVICE_EP, retry_strategy=oci.retry.NoneRetryStrategy())

    create_session_details = generative_ai_agent_runtime.models.CreateSessionDetails()
    create_session_details.display_name = "Welcome to BOCAS"
    create_session_details.idle_timeout_in_seconds = 20
    create_session_details.description = welcome_msg

    create_session_response = genai_agent_runtime_client.create_session(create_session_details, AGENT_EP_ID)

    #Define Chat details and input message/question
    session_details = generative_ai_agent_runtime.models.ChatDetails()
    session_details.session_id = create_session_response.data.id
    session_details.should_stream = False
    session_details.user_message = text_input

    #Get AI Agent Respose
    session_response = genai_agent_runtime_client.chat(agent_endpoint_id=AGENT_EP_ID, chat_details=session_details)
 
    #print(str(response.data))
    response = session_response.data.message.content.text
    return response
 
# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []
 
# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])
 
# Accept user input
if prompt := st.chat_input("How can I help?"):
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})
    # Display user message in chat message container
    with st.chat_message("user"):
        st.markdown(prompt)
 
    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        response = response_generator(prompt)
        write_response = st.write(response)
        st.session_state.messages.append({"role": "ai", "content": response})
    # Add assistant response to chat history

This entry was posted in Artificial Intelligence, LLM, OCI and tagged GenAI Agent, OCI, Python, Streamlit.

Calling Custom OCI Gen AI Agent using Python

Posted on February 24, 2025 Updated on February 22, 2025

In a previous post, I demonstrated how to create a custom Generative AI Agent on OCI. This GenAI Agent was built using some of Shakespeare’s works. Using the OCI GenAI Agent interface is an easy way to test the Agent and to see how it behaves. Beyond that, it doesn’t have any use as you’ll need to call it using some other language or tool. The most common of these is using Python.

The code below calls my GenAI Agent, which I’ve called BOCAS (Brendan’s Oracle Chat Agent for Shakespeare).

import oci
from oci import generative_ai_agent_runtime
import json
from colorama import Fore, Back, Style


CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', CONFIG_PROFILE)

#AI Agent service endpoint
SERVICE_EP = <add your Service Endpoint> 
AGENT_EP_ID = <add your GenAI Agent Endpoint>

welcome_msg = "This is Brendan's Oracle Chatbot Agent for Shakespeare. Ask questions about the works of Shakespeare."
    
def gen_Agent_Client():
    #Initiate AI Agent runtime client
    genai_agent_runtime_client = generative_ai_agent_runtime.GenerativeAiAgentRuntimeClient(config, service_endpoint=SERVICE_EP, retry_strategy=oci.retry.NoneRetryStrategy())

    create_session_details = generative_ai_agent_runtime.models.CreateSessionDetails()
    create_session_details.display_name = "Welcome to BOCAS"
    create_session_details.idle_timeout_in_seconds = 20
    create_session_details.description = welcome_msg

    return create_session_details, genai_agent_runtime_client

def Quest_Answer(user_question, create_session_details, genai_agent_runtime_client):
    #Create a Chat Session for AI Agent
    try:
        create_session_response = genai_agent_runtime_client.create_session(create_session_details, AGENT_EP_ID)
    except:
        create_session_details, genai_agent_runtime_client = gen_Agent_Client()
        create_session_response = genai_agent_runtime_client.create_session(create_session_details, AGENT_EP_ID)
    
    #Define Chat details and input message/question
    session_details = generative_ai_agent_runtime.models.ChatDetails()
    session_details.session_id = create_session_response.data.id
    session_details.should_stream = False
    session_details.user_message = user_question

    #Get AI Agent Respose
    session_response = genai_agent_runtime_client.chat(agent_endpoint_id=AGENT_EP_ID, chat_details=session_details)
    return session_response

print(Style.BRIGHT + Fore.RED + welcome_msg + Style.RESET_ALL)

ses_details, genai_client = gen_Agent_Client()

while True:
    question = input("Enter text (or Enter to quit): ")
    if not question:
        break
    chat_response = Quest_Answer(question, ses_details, genai_client)
    print(Style.DIM +'********** Question for BOCAS **********')
    print(Style.BRIGHT + Fore.RED + question + Style.RESET_ALL)
    print(Style.DIM + '********** Answer from BOCAS **********' + Style.RESET_ALL)
    print(Fore.MAGENTA + chat_response.data.message.content.text + Style.RESET_ALL)

print("*** The End - Exiting BOCAS ***")

When the above code is run, it will loop, asking for questions, until no question is added and the ‘Enter’ key is pressed. Here is the output of the BOCAS running for some of the questions I asked in my previous post, along with a few others. These questions are based on the Irish Leaving Certificate English Examination.

This entry was posted in AI, Artificial Intelligence, LLM, OCI and tagged AI Agent, education, english, learning, Leaving Certificate, LLM, teachers.

OCI Gen AI – How to call using Python

Posted on February 11, 2025

Oracle OCI has some Generative AI features, one of which is a Playground allowing you to play or experiment with using several of the Cohere models. The Playground includes Chat, Generation, Summarization and Embedding.

OCI Generative AI services are only available in a few Cloud Regions. You can check the available regions in the documentation. A simple way to check if it is available in your cloud account is to go to the menu and see if it is listed in the Analytics & AI section.

When the webpage opens you can select the Playground from the main page or select one of the options from the menu on the right-hand-side of the page. The following image shows this menu and in this image, I’ve selected the Chat option.

You can enter your questions into the chat box at the bottom of the screen. In the image, I’ve used the following text to generate a Retirement email.

A university professor has decided to retire early. write and email to faculty management and HR of his decision. The job has become very stressful and without proper supports I cannot continue in the role.  write me an email for this

Using this playground is useful for trying things out and to see what works and doesn’t work for you. When you are ready to use or deploy such a Generative AI solution, you’ll need to do so using some other coding environment. If you look toward the top right hand corner of this playground page, you’ll see a ‘View code’ button. When you click on this Code will be generated for you in Java and Python. You can copy and paste this to any environment and quickly have a Chatbot up and running in few minutes. I was going to say a few second but you do need to setup a .config file to setup a secure connection to your OCI account. Here is a blog post I wrote about setting this up.

Here is a copy of that Python code with some minor edits, 1) to remove my Compartment ID, 2) I’ve added some message requests. You can comment/uncomment as you like or add something new.

import oci

# Setup basic variables
# Auth Config
# TODO: Please update config profile name and use the compartmentId that has policies grant permissions for using Generative AI Service
compartment_id =  <add your Compartment ID>
CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', CONFIG_PROFILE)

# Service endpoint
endpoint = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"

generative_ai_inference_client = oci.generative_ai_inference.GenerativeAiInferenceClient(config=config, service_endpoint=endpoint, retry_strategy=oci.retry.NoneRetryStrategy(), timeout=(10,240))
chat_detail = oci.generative_ai_inference.models.ChatDetails()

chat_request = oci.generative_ai_inference.models.CohereChatRequest()
#chat_request.message = "Tell me what you can do?"
#chat_request.message = "How does GenAI work?"
chat_request.message = "What's the weather like today where I live?"
chat_request.message = "Could you look it up for me?"
chat_request.message = "Will Elon Musk buy OpenAI?"
chat_request.message = "Tell me about Stargate Project and how it will work?"
chat_request.message = "What is the most recent date your model is built on?"


chat_request.max_tokens = 600
chat_request.temperature = 1
chat_request.frequency_penalty = 0
chat_request.top_p = 0.75
chat_request.top_k = 0
chat_request.seed = None


chat_detail.serving_mode = oci.generative_ai_inference.models.OnDemandServingMode(model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceyanrlpnq5ybfu5hnzarg7jomak3q6kyhkzjsl4qj24fyoq")
chat_detail.chat_request = chat_request
chat_detail.compartment_id = compartment_id
chat_response = generative_ai_inference_client.chat(chat_detail)
# Print result
print("**************************Chat Result**************************")
print(vars(chat_response))

When I run the above code I get the following output.

NB: If you have the OCI Python package already installed you might need to update it to the most recent version

You can see there is a lot generated and returned in the response. We can tidy this up a little using the following and only display the response message.

import json
# Convert JSON output to a dictionary
data = chat_response.__dict__["data"]
output = json.loads(str(data))
 
# Print the output
print("---Message Returned by LLM---")
print(output["chat_response"]["chat_history"][1]["message"])

That’s it. Give it a try and see how you can build it into your applications.

This entry was posted in AI, Artificial Intelligence, LLM and tagged cloud, LLMs, OCI, oracle cloud.

← Previous
1
2
3
…
57
Next →

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: