New best story on Hacker News: Launch HN: Aqua Voice (YC W24) – Voice-driven text editor

Launch HN: Aqua Voice (YC W24) – Voice-driven text editor
373 by the_king | 127 comments
Hey HN! We’re Jack and Finn from Aqua Voice ( https://withaqua.com/ ). Aqua is a voice-native document editor that combines reliable dictation and natural language commands, letting you say things like: “make this a list” or “it’s Erin with an E” or “add an inline citation here for page 86 of this book”. Here is a demo: https://youtu.be/qwSAKg1YafM .
Finn, who is big-time dyslexic, has been using dictation software since the sixth grade when his dad set him up on Dragon Dictation. He used it through school to write papers, and has been keeping his own transcription benchmarks since college. All that time, writing with your voice has remained a cumbersome and brittle experience that is riddled with pain points.
Dictation software is still terrible. All the solutions basically compete on accuracy (i.e. speech recognition), but none of them deal with the fundamentally brittle nature of the text that they generate. They don't try to format text correctly and require you to learn a bunch of specialized commands, which often are not worth it. They're not even close to a voice replacement for a keyboard. Even post-LLM, you are limited to a set of specific commands and the most accurate models don’t have any commands. Outside of these rules, the models have no sense of what is an instruction and what is content. You can’t say “and format this like an email” or “make the last bullet point shorter”. Aqua solves this.
This problem is important to Finn and millions of other people who would write with their voice if they could. Initially, we didn't think of it as a startup project. It was just something we wanted for ourselves. We thought maybe we'd write a novel with it - or something. After friends started asking to use the early versions of Aqua, it occurred to us that, if we didn't build it, maybe nobody would.
Aqua Voice is a text editor that you talk to like a person. Depending on the way that you say it and the context in which you're operating, Aqua decides whether to transcribe what you said verbatim, execute a command, or subtly modify what you said into what you meant to write. For example, if you were to dictate: "Gryphons have classic forms resembling shield volcanoes," Aqua would output your text verbatim. But if you stumble over your words or start a sentence over a few times, Aqua is smart enough to figure that out and to only take the last version of the sentence.
The vision is not only to provide a more natural dictation experience, but to enable for the first time an AI-writing experience that feels natural and collaborative. This requires moving away from using LLMs for one-off chat requests and towards something that is more like streaming where you are in constant contact with the model. Voice is the natural medium for this.
Aqua is actually 6 models working together to transcribe, interpret, and rewrite the document according to your intent. Technically, executing a real-time voice application with a language model at its core requires complex coordination between multiple pieces. We use MoE transcription to outperform what was previously thought possible in terms of real-time accuracy. Then we sync up with a language model to determine what should be on the screen as quickly as possible.
The model isn't perfect, but it is ready for early adopters and we’ve already been getting feedback from grateful users. For example, a historian with carpal tunnel sent us an email he wrote using Aqua and said that he is now able to be five times as productive as he was previously. We've heard from other people with disabilities that prevent them from typing. We've also seen good adoption from people who are dyslexic or simply prefer talking to typing. It’s being used for everything from emails to brainstorming to papers to legal briefings.
While there is much left to do in terms of latency and robustness, the best experiences with Aqua are beginning to feel magical. We would love for you to try it out and give us feedback, which you can do with no account on https://withaqua.com . If you find it useful, it’s $10/month after a 1000-token free trial. (We want to bump the free trial in the future, but we're a small team, and running this thing isn’t cheap.) We’d love to hear your ideas and comments with voice-to-text!

New best story on Hacker News: New Aztec Codices Discovered: The Codices of San Andrés Tetepilco

New Aztec Codices Discovered: The Codices of San Andrés Tetepilco
370 by dzdt | 150 comments


New best story on Hacker News: The Francis Scott Key Bridge in Baltimore, Maryland Has Collapsed

The Francis Scott Key Bridge in Baltimore, Maryland Has Collapsed
512 by repelsteeltje | 370 comments


New best story on Hacker News: Show HN: Memories – FOSS Google Photos alternative built for high performance

Show HN: Memories – FOSS Google Photos alternative built for high performance
694 by radialapps | 201 comments
Memories is a FOSS Google Photos alternative that you can self-host (it runs as a Nextcloud plugin).
Website: https://ift.tt/tlVrPKQ
GitHub: https://ift.tt/4HML7Pr
Demo Server: https://ift.tt/LekHxBN (demo runs in San Francisco on a free-tier cloud VM)
Memories has been built from the ground up for high performance and is extremely fast when configured correctly. In our testing environment, it can load a timeline view with 100k photos in under 500ms, including query and rendering time!
Some features to highlight:
* A timeline similar to Google Photos where you can skip to any time in history instantly.
* AI-based tagging that runs locally on your server, identifying and tagging people and objects.
* Albums and external sharing.
* Metadata editing support.
* A world map of your photos, supported both on mobile and the web.
* Did I mention it's extremely fast?
Would love to hear feedback from the HN community! :)

New best story on Hacker News: ChatControl: EU wants to scan all private messages, even in encrypted apps

ChatControl: EU wants to scan all private messages, even in encrypted apps
942 by Metalhearf | 515 comments