Ishiki Labs Vision
The core of what we're building here at Ishiki Labs is multimodal AI systems that are truly "aware" — knowing when to stay silent and when to respond to real-time video and audio streams. This is a fundamental part of how we operate as humans, yet it remains an unsolved challenge for today's AI, which is more or less a collection of fancy request-response systems. We believe this is a fundamental gap that must be closed before we can achieve embodied AI systems (robots) that can act autonomously in any environment.
Put another way, AIs today cannot be truly "proactive". They cannot guide you through cooking a recipe, recognizing when you've finished a step or made a mistake. They cannot serendipitously comment on an interesting piece of art you just bought. They cannot point out what you missed on your shopping list at the grocery store.
Our long-term vision is to build a multimodal AI that can emulate human consciousness and to tackle robotics with a software-first approach. We've started with the ability to act like a human in day-to-day meetings with our fern-0.1-base system, and we've also built out procedural, step-by-step guidance with the more experimental fern-0.1-multistep. You can try both below.
If you'd like to collaborate with us, please email us at founders@ishikilabs.ai.
Try Fern
Video + audio AI that knows when you're talking with it and passively listens to the conversation context
Everything in the base model, plus proactive, multistep procedural guidance
fern-0.1-multistep examples