Oct 19, 2021

How I turned my phone into a 3D mouse

This is the transcript of the talk by Akhil presented at Git Commit Show 2019.

About the speaker

At Git Commit Show, Akhil joined us from Hyderabad and demonstrated how he converted his phone into a 3D mouse, creating an amazing painting with his phone as the mouse. He shared his journey, challenges and future plans, and explained how he built it using Node.js, WebRTC, gyroscope sensors, DOM APIs and more. He also shared how his project can spawn multiple innovations. Akhil intends to open-source his project.

Transcript

We have a smartphone and a TV screen. Actually, we can connect the smartphone to any TV screen or any computer screen using WebRTC. Even Byron used it for his video communication presentation. So we are actually using two browsers that communicate using the device orientation. So the way you turn your device, the picture would be displayed onto the screen. Actually, here we are using three things: Node.js, Socket.IO, and also p5.js for the painting. So it's not just a remote for the TV; you can connect your smartphone without any wire to any of your computer screens, as long as it has a browser. All you need is a browser on your smartphone, a browser on whatever screen you are trying to connect to, and both of them connected to the same local network. And that's it. So when you start moving your smartphone, it starts drawing onto the screen. So let me answer these three questions.

I have three questions, because these are the fundamental questions behind starting any project.
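The two-browser setup described above can be sketched as a small relay: the phone publishes orientation readings and every other connected client (the screen) receives them. This is only an illustration under stated assumptions. In the project this role is played by Socket.IO on a Node.js server; the `Relay` class, the `OrientationMessage` type and the client ids below are hypothetical names, and the relay is kept in memory so the sketch runs without any dependencies.

```typescript
// Orientation reading sent from the phone browser. The field names mirror the
// alpha/beta/gamma angles (in degrees) of the browser's deviceorientation event.
interface OrientationMessage {
  alpha: number; // rotation around the z-axis
  beta: number;  // rotation around the x-axis
  gamma: number; // rotation around the y-axis
}

type Listener = (msg: OrientationMessage) => void;

// Hypothetical in-memory stand-in for the relay server: a message published
// by one client is forwarded to all other connected clients.
class Relay {
  private clients: { id: string; listener: Listener }[] = [];

  connect(id: string, listener: Listener): void {
    this.clients.push({ id, listener });
  }

  publish(senderId: string, msg: OrientationMessage): void {
    this.clients.forEach((client) => {
      // Do not echo the message back to the sender.
      if (client.id !== senderId) client.listener(msg);
    });
  }
}

// Usage: the "screen" collects whatever the "phone" sends.
const relay = new Relay();
const received: OrientationMessage[] = [];
relay.connect("screen", (msg) => { received.push(msg); });
relay.connect("phone", () => {});
relay.publish("phone", { alpha: 10, beta: 5, gamma: -3 });
```

With a real server, the same pattern would be a connection handler that rebroadcasts each orientation event to the other sockets.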

Q. Why did I start this project?

This was a very experimental project for me. While I was learning Socket.IO and WebRTC, I started thinking: why not connect our smartphone and use it as a remote for any screen, start interacting with that screen, and start drawing onto it? So I started out with this project, began by learning Socket.IO, and then came up with this thing.

Q. What exactly is this project, and where are we heading with it?

So this is a very simple, fundamentally browser-based project, where you only have two browsers interacting with each other. You need not have any IR blaster or, you know, Bluetooth connections or anything. All you need to do is connect to your local network and start drawing onto your screen. And how did I do it? As I keep repeating, I have used Socket.IO, WebRTC and Node.js for drawing this thing.

Here is a demo. When you pick up the smartphone and start rotating it, you are orienting your device in 3D space, but you convert that into the 2D space of your computer screen and start drawing with your smartphone. And when it comes to the colors, the way you rotate it and the speed with which you rotate or orient it change the colors. So this is very different math than what we learned in our 11th or 12th class, and I have just upgraded. So this is more of an educational project, not a startup product or anything. I want to put it out, and I'm still working on it. So I guess I would be releasing it in July, or maybe [unclear].
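As a rough illustration of the demo described above, the sketch below maps two tilt angles onto canvas pixels and turns rotation speed into a hue. It is not the speaker's actual code. The function names, the linear mapping, the 45-degree tilt range and the 360-degrees-per-second speed cap are all assumptions chosen to keep the example small.

```typescript
interface Point { x: number; y: number; }

// Clamp a value into the [min, max] range.
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Map tilt angles (degrees) linearly onto canvas pixels.
// Assumption: a tilt of +/- maxTilt degrees spans the full canvas.
function tiltToCanvas(
  gamma: number, beta: number, width: number, height: number, maxTilt = 45
): Point {
  const nx = clamp(gamma / maxTilt, -1, 1); // left/right tilt, -1..1
  const ny = clamp(beta / maxTilt, -1, 1);  // front/back tilt, -1..1
  return { x: ((nx + 1) / 2) * width, y: ((ny + 1) / 2) * height };
}

// Turn rotation speed (degrees per second) into a hue on a 0..360 color wheel.
// Assumption: speeds at or above maxSpeed saturate at hue 360.
function speedToHue(
  prevAngle: number, angle: number, dtSeconds: number, maxSpeed = 360
): number {
  const speed = Math.abs(angle - prevAngle) / dtSeconds;
  return clamp(speed / maxSpeed, 0, 1) * 360;
}

// Usage: a phone held level draws at the center of an 800x600 canvas.
const center = tiltToCanvas(0, 0, 800, 600); // { x: 400, y: 300 }
// Rotating 90 degrees in half a second is 180 deg/s, i.e. hue 180.
const hue = speedToHue(0, 90, 0.5);
```

A drawing library with an HSB color mode could consume the hue directly; the linear mapping is the simplest choice and ignores perspective.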

Q. Where should I learn all these things?

So how should we learn these things? You know, the web is a very large space where you can learn anything from any website, and we have great communities. Socket.IO has a great community. You can start learning with sockets and understand how devices can communicate.

Then you have WebRTC, as Byron also said. You can see how powerful robotics is. You can stream video between two different endpoints, so WebRTC can be used over there. Everyone would be using this, and we would help people. I also want to stress the painting library, p5.js: it has an excellent open-source community, and it lets you draw a canvas in your web browser and start interacting with it. You can start learning different kinds of programming there, and all those things are very simple: additions and subtractions, Cartesian coordinates, trigonometry, all that mathematical stuff.

Now it's time for questions and answers

1. In all of this, Node.js and WebRTC, one thing that I'm very curious about is the device orientation. How do you move your hands? How do you enable this? At what scale, and with what kind of accuracy? What kind of tools and technology do you use to control this orientation?

See, our smartphones have two modes, right? Portrait mode and landscape mode. So did you ever wonder how a phone understands whether you are in portrait mode or landscape mode? It uses a gyro sensor; it has a gyroscope sensor embedded in it. So it would be saying, okay, the device has rotated five degrees about the x-axis, or 10 degrees about the y-axis or something, and it starts calculating. So this is a little bit of mathematical stuff, where you are working with two different frames: you are on the earth and have one rotation, and your smartphone has a different rotation, right? So you have to reconcile both and work out the actual orientation of your device. The technology I have used here is a simple web API, a DOM API: we have the Device Orientation API, which is really left out. Nobody uses it because they think it is useless. So I thought, why is it useless? It's there, it has an API, let's start using it. When you use this API, you get the degrees of orientation of your smartphone. But you also have to calculate your own orientation: let's say you are facing the north side and your smartphone is facing the west side, then the orientation would be changed. So for all this stuff, you have to use trigonometry. But as a synopsis of the complete answer to this question: I used the Device Orientation web API, and the math where you convert 3D space into 2D using Cartesian coordinates.
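One way to picture the trigonometry mentioned in this answer is to treat the phone as a pointer: measure its angles relative to a calibrated starting orientation, then use the tangent to find where that direction would hit a flat screen. The sketch below is an assumption-laden illustration, not the speaker's implementation. `angleDelta`, `projectToScreen`, the 30-degree half field of view and the sign conventions are all hypothetical choices; the `alpha` and `beta` fields stand for the angles reported by the browser's `deviceorientation` event.

```typescript
const DEG_TO_RAD = Math.PI / 180;

interface Orientation { alpha: number; beta: number; } // degrees

// Signed difference between two compass-style headings (each in 0..360),
// wrapped into [-180, 180) so that 350 vs 10 gives -20 rather than 340.
function angleDelta(current: number, reference: number): number {
  return ((current - reference + 540) % 360) - 180;
}

// Project the pointing direction onto a flat screen of width x height pixels.
// Assumptions: `reference` was captured while the phone pointed at the screen
// center, and +/- halfFovDeg degrees reaches the screen edges.
function projectToScreen(
  current: Orientation, reference: Orientation,
  width: number, height: number, halfFovDeg = 30
): { x: number; y: number } {
  const limitAngle = (a: number) => Math.max(-halfFovDeg, Math.min(halfFovDeg, a));
  const yaw = limitAngle(angleDelta(current.alpha, reference.alpha));
  const pitch = limitAngle(current.beta - reference.beta);
  // tan(angle) is where a ray leaving the phone hits a plane one unit away;
  // dividing by tan(halfFov) normalizes the result to -1..1.
  const edge = Math.tan(halfFovDeg * DEG_TO_RAD);
  const u = Math.tan(yaw * DEG_TO_RAD) / edge;
  const v = Math.tan(pitch * DEG_TO_RAD) / edge;
  // Sign convention (an assumption): larger alpha moves the pointer left and
  // larger beta moves it up. Flip the signs if the movement feels mirrored.
  return { x: width / 2 - u * (width / 2), y: height / 2 - v * (height / 2) };
}

// Usage: calibrate once, then project later readings against that reference.
const reference: Orientation = { alpha: 10, beta: 0 };
const atCenter = projectToScreen({ alpha: 10, beta: 0 }, reference, 800, 600);
const wrapped = projectToScreen({ alpha: 350, beta: 0 }, reference, 800, 600);
```

Subtracting a reference orientation is what handles the "you face north, the phone faces west" situation: only the change since calibration drives the pointer.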

2. What were the major challenges in this project?

The first thing was, I am very weak at math. I did not understand how to convert 3D and all those things. That was a major challenge, because you get x, y and z, all three things, from your API, and you have to convert them into only x and y, ignoring z. But here is the thing: if you ignore the z component, then it's no use, right? You have to make sure that coordinate is also projected onto your screen. So that was the main challenge. I had to learn mathematics again from the ninth and tenth class books, to understand what exactly is going on and get back to the basics. So that was the first challenge. The second challenge was regarding Socket.IO: how it works, and how to make sure that it is more performant. Even when you saw Byron's demo, the main point he made there was about increasing performance. So the performance is very low with Socket.IO, and we have to increase it for this thing. That was the second challenge.

And the third challenge was not so much a technical thing as a non-technical one: believing in your idea. This is just a painting on your wall; who would care about this thing, right? Who would care about just connecting a smartphone to a screen? When you see it, it looks like just an artistic thing, with nothing technical in it. But you have to believe in your idea to go through the complete process. It took almost two weeks of my time to complete the work. So believing in your project is the third challenge: you have to have a sustained belief in your project, keep going until it is complete, and display it to the audience. So I think those are the three main challenges in the project.
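On the second challenge, performance: orientation events can fire many times per second, so one common way to reduce load on the socket is to throttle what gets sent. The talk does not say which technique was actually used, so the sketch below is only one possible approach. `throttle` and the 50 ms interval are hypothetical, and the clock is injectable purely so the behaviour can be checked without real timers.

```typescript
// Wrap a send function so it forwards at most one message per interval.
// Messages arriving too soon after the last forwarded one are dropped.
function throttle<T>(
  send: (msg: T) => void,
  intervalMs: number,
  now: () => number = Date.now
): (msg: T) => void {
  let lastSent = -Infinity;
  return (msg: T) => {
    const t = now();
    if (t - lastSent >= intervalMs) {
      lastSent = t;
      send(msg);
    }
  };
}

// Usage with a fake clock: only readings at least 50 ms apart get through.
const sent: number[] = [];
let clock = 0;
const sendThrottled = throttle((angle: number) => { sent.push(angle); }, 50, () => clock);

clock = 0;   sendThrottled(1); // forwarded (first message)
clock = 10;  sendThrottled(2); // dropped, only 10 ms since the last send
clock = 50;  sendThrottled(3); // forwarded, 50 ms elapsed
clock = 120; sendThrottled(4); // forwarded, 70 ms elapsed
```

Dropping intermediate readings is acceptable here because each message carries the full current orientation, so a later reading supersedes the skipped ones.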

3. What is the future? Where do you see this project going? Where do you see this concept being utilized in different use cases?

Yeah, even I had these two questions: "Wait, should I go with this project? Should I make it?" When I introduced the demo and showed this video to my friends and some colleagues, they said, yeah, it's awesome, why not commercialize it? But I had a different idea: why not open-source it and make sure that the technology is available to everyone? Because what I learned came from open source; I started learning to program from open-source communities. And if I do something on the basis of open source, it must be open source instead of being commercialized. So I don't think it would be going towards commercialization or anything.

And the second thing is the use cases. When I posted this thing onto a few websites, on LinkedIn and some other blogging sites, people said that it looks so satisfying, and that it might be helpful for some people. So I started digging there: why not have a human aspect for this product? Why only a monetary aspect? Why not have a humanitarian aspect for this? In kids, there is a thing called autism. So I think I probably want to mold this project as a tool for kids with autism to communicate with people, to draw things and everything. For example, what autism means is that, in some instances, kids with autism don't communicate directly. They will be very silent and calm. They don't use normal ways of communication. They try to communicate with paintings or drawings or probably some other things. So why not use this as a tool for those people? That thought is running in my mind as of now.

The second thing is, if we want to go the monetary way, I guess we can use it for art installations, probably in museums or in malls and everything. It could be a place for a few sketches, or a medium for people to communicate, or to start drawing and enhance their skills.

The use cases I am seeing now are these. We can use it for art installations. Generally in India, we do not have that many art installations or digital art installations. But across the world, we have many digital art installations where people can see art on computer screens. The thing they miss out on there is that they generally cannot interact. Even if they want to interact, there would have to be touchscreens, which are very big and might be too costly to afford. So I'm thinking, why not use our own devices to interact with those art installations? It might be useful in that use case. And the second thing is displayed advertisements, you know, the advertisements that are displayed in Metro trains or anywhere; people can probably start interacting with them. And even in museums, people are not allowed to touch the actual artifacts. But if you want to touch them, to get a feel of touching them and see how they respond to touch, then we can install huge screens showing digital versions of the artifacts, and people can start interacting with them using their devices. So this does not cause any harm to the actual artifacts in the museum, and people can still have this digitized interaction with the museum artifacts. So those are the two use cases I am looking into.

Thank you so much!!