As part of our engineer’s blog, I’d like to share what I’m learning day to day. This time, I’ll introduce the popular video generation tool Sora 2.
■ From Books to AI──The Evolution of Information Retrieval
In the old days, learning a skill meant frequenting bookstores and libraries; it was an era of poring over thick technical manuals and taking time to understand them. Then the spread of the internet gave birth to a culture of “searching,” and soon “Googling” became commonplace. And now we have entered the era of “asking AI.” In just a few decades, our relationship with knowledge has evolved this far.
■ How are you all using it?
Many people have tried out AI tools like ChatGPT, image generation AI, and voice synthesis tools. At Dandelions Japan, we’re already leveraging AI in video production for e-commerce sites. By incorporating AI-assisted video generation, we’ve significantly reduced production costs and improved speed. AI is no longer just “research”—it’s become a “tool for the field.” And its adoption will only accelerate from here.
■ The Impact of Sora 2’s “Life-Like Videos”
OpenAI’s “Sora 2” is a technology that generates realistic videos from text. It has reached a level where it can create footage of people speaking and moving naturally from a single still image. For example:
・A speech video of an event that never took place
・“Dance” scenes that were never filmed
・A video of a deceased person that appears to speak, generated from photographs
──all of these can be created from just a few lines of text. It is technologically astonishing, yet it simultaneously highlights issues of copyright and portrait rights: even though the input is only a person’s image, a video that looks exactly like them can be generated. The boundary between reality and the virtual is becoming more ambiguous than ever before.
■ How to Leverage AI Videos?
When used correctly, this technology opens up endless possibilities for creativity:
・Historical reenactments for education and museums
・Sign language and language learning support
・Streamlining product introduction and advertising videos
・Virtual appearances by artists
At our e-commerce site, Dandelions Japan, we are also gradually incorporating AI-generated content into product description videos and promotional footage.
Video uploaded to the official website
■ Three Proposals for Effective Utilization
Obtain the consent of the individual or creator
AI-generated content must prioritize obtaining consent from the rights holders of the source material.
Disclose that the content is AI-generated
By explicitly stating “This video was generated by AI,” we maintain transparency and trustworthiness.
Limit use to educational, research, and expressive purposes
Use the technology not to fake reality, but solely as a “creative assistive technology.”
■ Summary: AI’s value changes based on “who uses it and how”
AI technology is evolving daily. “Hyper-realistic videos” like those from Sora 2 expand creative possibilities while demanding new ethical frameworks from society as a whole. What’s required of us creators and businesses isn’t choosing “not to use” it, but adopting an attitude of designing “how to use it.” Rather than fearing AI, we should strive to understand it correctly and explore ways to use it for the benefit of the future.
As part of our engineer’s blog, I’d like to share what I’m learning day to day. This time, the theme is AI Agents, which have been gaining significant attention in recent years.
An AI Agent is an autonomous system capable of recognizing its surroundings, making judgments based on the situation, and taking action. Unlike ordinary automation, its defining feature is the ability to flexibly handle tasks using learning and reasoning. Recently, examples of incorporating AI Agents into work and system development using frameworks like ChatGPT and LangChain are increasing.
1. Basic Elements of an AI Agent
An AI Agent consists of the following four elements:
Perception: Recognizing the environment and data (e.g., text analysis, sensor information)
Reasoning: Making decisions based on knowledge and rules
Action: Executing tasks (e.g., API calls, document generation)
Learning: Accumulating experience and improving
Traditional programs only performed predetermined tasks, but AI Agents can flexibly adapt their actions based on the situation.
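The perceive–reason–act–learn cycle above can be sketched in a few lines. Everything below is illustrative—the function names and the toy “environment” are my own, not from any particular framework—and a real agent would call an LLM or external APIs at the reason and act steps:

```javascript
// Minimal agent loop: perceive -> reason -> act -> learn.

function perceive(environment) {
  // Recognize the environment: here, just find tasks that are not done yet.
  return { pendingTasks: environment.tasks.filter((t) => !t.done) };
}

function reason(observation, memory) {
  // Decide what to do next based on the observation and past experience.
  if (observation.pendingTasks.length === 0) return { type: "idle" };
  // Prefer task kinds that succeeded before (a crude form of learned preference).
  const sorted = [...observation.pendingTasks].sort(
    (a, b) => (memory.successes[b.kind] || 0) - (memory.successes[a.kind] || 0)
  );
  return { type: "execute", task: sorted[0] };
}

function act(decision, environment) {
  // Execute the chosen task (here: simply mark it done).
  if (decision.type !== "execute") return { ok: false };
  decision.task.done = true;
  return { ok: true, kind: decision.task.kind };
}

function learn(result, memory) {
  // Accumulate experience: count which task kinds succeed.
  if (result.ok) {
    memory.successes[result.kind] = (memory.successes[result.kind] || 0) + 1;
  }
}

function runAgent(environment, steps) {
  const memory = { successes: {} };
  for (let i = 0; i < steps; i++) {
    const observation = perceive(environment);
    const decision = reason(observation, memory);
    const result = act(decision, environment);
    learn(result, memory);
  }
  return memory;
}
```

Unlike a fixed script, the loop re-observes the environment on every iteration, so its behavior adapts as the situation (and its memory) changes.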
2. Representative Frameworks
① LangChain
Features: Utilizes an LLM (Large Language Model) to decompose and execute complex tasks.
Strengths: Easy API integration, enabling automatic combination of search, calculation, and information organization.
Usage examples: Document search agent, automated customer support responses.
② AutoGPT
Features: When a goal is set, it autonomously plans and executes multiple tasks.
Strength: Can determine next steps independently, with minimal human direction.
Use cases: Automating research tasks, drafting blog posts.
③ BabyAGI
Features: A compact autonomous agent that utilizes memory to continuously manage tasks.
Strengths: Simple and lightweight, suitable for personal use and small-scale projects.
Use cases: Daily to-do list management, automating information organization.
3. Use Cases
Software Development: Support from requirements definition to test automation
Business Operations: Streamlining customer support, scheduling, and research tasks
Daily Life: Smart home control, learning support
These are not merely “tools,” but are expected to be “partners” that collaborate with humans to achieve results.
4. Future Outlook and Challenges
Outlook: Advancement toward “multi-agent systems” where multiple AI agents collaborate to solve problems as a team.
Challenges: Reliability, security, and ethical concerns (risks of misinformation and excessive autonomy).
This time, we introduced the basics of AI Agents and their potential applications. Next time, we plan to share our hands-on experience building an AI Agent using LangChain and having it perform simple tasks, along with our impressions of using it. We’ll include code examples and results to explain, based on our experience, how it can be applied to business operations.
If you’re interested in AI Agents, be sure to tune in next time.
In this engineer’s blog, I introduce the learning topics I’m working on day to day. Recently, I created my own Discord BOT and successfully ran it for free, 24/7, so I’d like to share that experience this time.
What’s Discord?
Before diving into the main topic, let me give a brief overview of Discord. Discord is a communication service developed in the United States that supports text, voice, and video interactions. I mainly use it as a call app while working.
Here are the main features of Discord (only those relevant to the main topic are covered):
◆ Server A server is a feature similar to a group chat, allowing multiple users to interact through text chat and voice calls. Servers come in two types: open servers, which can be made public through an application, found via search within Discord, or announced on listing sites to recruit members, and private servers, which can only be joined by invitation.
◆ Text Channel This is a group chat feature within a server where members can communicate with each other through text.
◆ Voice Channel This is a group call feature within a server, allowing members in the channel to communicate and share their screens.
◆ BOT By introducing a BOT, you can enhance the functionality of your server. Inviting publicly available BOTs allows you to add useful features like music playback, text-to-speech, timers, and schedule management. You can also invite and operate your own custom-made BOT.
From Creating to Operating a BOT
The BOT I created this time is a ‘Voice Channel Entry Notification BOT’ for private servers.
◆ Inspiration In Discord, when someone enters a voice channel, there is no built-in feature to notify other server members. As a result, a member might end up working alone in a channel, waiting for others who never realize they are there. That’s when I thought, ‘Wouldn’t it be useful to have a BOT that sends a notification when someone enters a channel?’ And so, I decided to create one.
◆ Implementation Details
① Decide on the BOT’s features and write the code
I wrote the code as a main.js file in JavaScript, to match the services that allow free 24/7 BOT operation.
Features included:
When the number of users in the ‘Working’ voice channel changes from 0 to 1, automatically send a recruitment message to the ‘Call Recruitment (Auto)’ text channel.
If the number drops from 1 to 0, delete the recruitment messages previously sent to ‘Call Recruitment (Auto)’ entirely.
When the count goes from 1 to 2 or more users, no additional recruitment messages are sent to ‘Call Recruitment (Auto)’.
Below is part of the actual coding.
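(The actual code was shared as an image. As a hedged sketch, the core decision logic for the three features above might look like the following—the function names are hypothetical, and in the real BOT this would run inside a discord.js-style voice state event handler, though the exact library on Deno may differ.)

```javascript
// Decide what to do when the member count of the "Working" voice channel changes.
function decideAction(before, after) {
  if (before === 0 && after === 1) return "post";   // first person joined -> send a recruitment message
  if (before === 1 && after === 0) return "delete"; // channel emptied -> delete the recruitment messages
  return "none";                                    // 1 -> 2 or more, etc.: do nothing
}

// In the real BOT, something along these lines (pseudocode) wires it up:
//
// client.on("voiceStateUpdate", async (oldState, newState) => {
//   const before = countMembers(workingChannel, oldState); // counts before/after the change
//   const after  = countMembers(workingChannel, newState);
//   switch (decideAction(before, after)) {
//     case "post":   await recruitChannel.send("Someone started working — join the call!"); break;
//     case "delete": /* bulk-delete the BOT's own messages in recruitChannel */ break;
//   }
// });
```

Keeping the transition rule in a small pure function like `decideAction` makes the 0→1 / 1→0 / 1→2 behavior easy to test without connecting to Discord at all.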
② Create a BOT Account on Discord Developer Portal Access the Discord Developer Portal, create a BOT account from ‘New Application,’ and obtain the token.
Enable the following in the BOT’s permission settings.
After configuring the BOT’s permissions, invite the BOT to your server using the generated URL.
③ Create a Repository on GitHub and Upload the Source Code I used Git, which I’ve previously introduced on my engineer blog and discussed during an internal workshop. Create a repository on GitHub and upload the source code.
④ Link Deno Deploy with GitHub and Log In Create a project in Deno Deploy and import the repository created on GitHub. Set the token obtained in step ② as an environment variable.
⑤ Add Deno.cron at the End of the Source Code to Enable 24/7 Operation To ensure the BOT doesn’t stop, add a periodic process (cron job) at the end of the source code that performs a light operation every 2 minutes on Deno Deploy. This allows the BOT to operate 24/7.
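For reference, the keep-alive in this step can be as small as the snippet below. The task name and log message are placeholders; `Deno.cron` takes a name, a cron expression, and a handler, and it exists only on the Deno runtime (including Deno Deploy), so the registration is guarded:

```javascript
// Cron expression for "every 2 minutes".
const KEEPALIVE_SCHEDULE = "*/2 * * * *";

// Register a trivial periodic task so Deno Deploy keeps the BOT warm.
// Returns false when not running under Deno (e.g. during local tests).
function registerKeepalive() {
  if (typeof Deno === "undefined" || typeof Deno.cron !== "function") return false;
  Deno.cron("keepalive", KEEPALIVE_SCHEDULE, () => {
    console.log("keepalive tick"); // any light operation works here
  });
  return true;
}

registerKeepalive();
```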
⑥ Test on the Discord Server Enter the voice channel and check if it operates as expected.
・When the voice channel has no members, the text channel has no messages.
・When you join the voice channel and its member count changes from 0 to 1, a call recruitment message is sent to the text channel.
・When you leave the voice channel and its member count drops from 1 to 0, the message in the text channel is automatically deleted.
Everything worked as expected; the last step is to confirm the next morning that the BOT has not gone offline, and then it’s complete.
That wraps up the discussion on creating a Discord BOT and achieving free 24-hour operation.
Let us introduce the learning topics we work on daily as part of our engineer blog. This time, it’s Team 2, focusing on low-code development!
This time’s theme: (OutSystems) Calling External APIs
In OutSystems, there are several ways to use externally published APIs. This time, we’ll focus on how to call REST-style APIs.
The API we will use this time
This time, we’ll use an example of calling ‘Completions,’ one of the APIs provided by OpenAI (link: API reference). It’s the same kind of API that powers ‘ChatGPT’: the user sends a request, and the AI responds with a written answer. Many of you have probably seen or used it before.
Implementation Example
This time, we used this API to create a feature where AI evaluates Japanese reports submitted by users and provides feedback on the content, including assessments and suggestions for improvement.
The layout of the API execution screen
The layout is quite simple: users input text into the ‘Report Input Field’ on the left side of the screen. By pressing the ‘Evaluate’ button, evaluation results and improvement feedback are displayed in the ‘Feedback’ section on the right.
The ‘GetFeedback’ Action executed when the button is pressed
When the button is pressed, the Data Action that wraps the REST API is re-executed. The flow sets the parameters required for the request and then executes the API itself. So how do you configure the API request here? In OutSystems, providing a sample request automatically generates the necessary configuration. Below, we’ll explain further with images.
1. Display the REST context menu and select ‘Consume REST API…’. This time, choose ‘Add single method.’
2. A screen appears for configuring the target REST URL and request/response samples. Set up the request body and headers based on the API Reference in the ‘Completions’ link mentioned earlier (see the body and header configuration examples).
3. Enter test data into the body and headers and perform a test run to confirm the request and response results. If there are no issues, press the ‘Finish’ button; the main action and the data structures needed for the request and response are generated automatically.
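As a concrete reference for the body and header samples, a request to the Completions endpoint takes roughly the following shape. The model name, prompt, and parameter values are placeholders based on the public API reference; replace them with your own:

```
POST https://api.openai.com/v1/completions

Headers:
  Content-Type: application/json
  Authorization: Bearer <YOUR_API_KEY>

Body:
{
  "model": "gpt-3.5-turbo-instruct",
  "prompt": "Evaluate the following report and suggest improvements:\n<report text>",
  "max_tokens": 500,
  "temperature": 0.7
}
```

The generated text comes back in the response under `choices[0].text`, which is what gets mapped to the ‘Feedback’ area of the screen.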
Once this is done, you can configure it just as you would build any logic in OutSystems: place the created REST API as an action in the flow, set the necessary inputs, and handle the return values to complete the API call.
Execution example: Feedback is returned for the entered report.
Summary
How was it? With the URL of a publicly available API and its reference, you can call it from OutSystems just like a regular action. However, be aware that public APIs may have usage fees, limits on the number of calls, or data-volume restrictions, so please plan your usage carefully…
Team 2 will continue to share engineer blogs using low-code tools like OutSystems in the future, so stay tuned!
In this engineer’s blog, we explore the latest AI technologies impacting the field of medicine. In recent years, advances in artificial intelligence (AI) have made a significant mark on healthcare. Among the most notable developments are in drug discovery and protein folding. These areas are critical for understanding disease mechanisms and developing new drugs, and the introduction of AI has brought about major breakthroughs.
1. The Intersection of AI and Biology
The power of AI is transforming traditional methods in biological research. Proteins are fundamental building blocks of life, and understanding their three-dimensional structures is essential to uncovering cellular functions and disease causes. However, predicting these structures has long been a complex, time-consuming, and costly task.
This is where AlphaFold, developed by DeepMind, comes in. Leveraging AI, AlphaFold can accurately predict protein structures and has had a profound impact on the scientific community.
2. The Evolution of AlphaFold and a Nobel Prize Win
Since its debut in 2018, AlphaFold has been widely used by researchers to unravel the structures of many previously unknown proteins. Its successor, AlphaFold2, announced in 2020, brought even greater precision, significantly advancing both science and medicine.
In recognition of this groundbreaking achievement, the leading researchers behind AlphaFold2 were awarded the 2024 Nobel Prize in Chemistry. This milestone marked the moment when AI was officially acknowledged as a transformative force in pure science. AlphaFold2’s ability to determine protein structures within minutes or hours—something that once took months using conventional experimental methods—was especially praised.
Far more than just another AI tool, AlphaFold2 is now regarded as one of the foundational technologies in life sciences.
3. The Science of Protein Folding
How proteins fold is closely linked to their function and their role in diseases. Misfolded proteins, for example, can be the cause of conditions such as Alzheimer’s or Parkinson’s disease.
Previously, it could take months to identify the structure of a single protein using experimental techniques. With AI-driven predictions, this can now be done in just a few hours. As a result, both the speed and scale of research have dramatically improved.
4. Challenges and Future Prospects
Despite its promise, AI-based structure prediction still faces challenges. For instance, not all proteins function independently—they often work in complex interactions with others. Predicting such protein complexes remains an area of ongoing development.
Another challenge is how to interpret AI-generated results in a biologically meaningful way and translate them into clinical applications. Moving forward, the integration of AI with experimental data will become increasingly important.
5. A Revolution in Drug Discovery Powered by AI
AI is already being applied in the field of drug discovery, aiding in candidate compound selection, side effect prediction, and clinical trial optimization. This contributes to shorter development timelines and reduced costs.
AI’s role is particularly vital in advancing personalized medicine, where treatments are tailored to individual patients. In the future, AI may enable the development of therapies customized for each person.
6. The Future of Healthcare: Hope Brought by AI
AI’s rapid progress is reshaping the very nature of healthcare. We may soon see a future where AI analyzes symptoms, genetic data, and lifestyle habits to instantly suggest the most effective treatments.
Of course, ethical and privacy concerns remain, but overcoming these challenges could lead to faster, safer, and more effective medical care.
Conclusion
The convergence of AI and biology has the power to accelerate and refine our understanding of disease and the development of treatments like never before. With AlphaFold2’s Nobel Prize win, AI is no longer just a supporting tool—it has been officially recognized as a new driving force in science.
What we’re witnessing now is, quite literally, the future of medicine. It will be fascinating to watch how this technology continues to evolve and transform our lives.