
Data health and … delusion

For once, I’ll write about data but not in a technical way.

For once, I’ll write about something that some people notice without really caring because, well, this is research after all.

I’m not a typical computer scientist. I didn’t learn it by the book but by doing. My very first code was in R because I couldn’t afford statistical software and … I loved it. Then I moved to MATLAB and discovered object-oriented programming thanks to a post-doc for whom computers held few to no secrets. At that time (we’re in 2013), I was a neuroscience student, and even using the “cd” command in a terminal made me feel like a pro. Big data was the buzzword on everybody’s lips. What I was not yet able to see was that in research labs, big data had already been a reality for quite a while. Terabytes of EEG recordings stored on hard drives (sometimes on CDs), microscopy images, videos and CSV files of behavioral experiments on humans, rodents … Most of the time, if you were not working in the same lab as the team recording those data, or in collaboration with them, there was no way to access them. But all this seemed normal: big data tools were in their early stages and “data scientist” was not even a title yet.

Then I finished my master’s degree and traveled the world a little. Coming back to France, I decided to become a developer. I started to watch some webinars and got interested in databases. I have always loved imagining efficient ways to store things that would let me retrieve them without any effort. Like a lot of people, I first heard of tables and SQL. Then curiosity led me to NoSQL and column databases. I also discovered cloud computing and all the sharing and storage opportunities it offers. It seemed like a real universe of infinite possibilities was opening up. At that time I was working for Capgemini in Toulouse on projects with the French space agency (CNES). Big data was really big (from tera, I jumped to peta) and research conditions in neuroscience were far behind.

While coming back to neurons (artificial ones) by developing artificial intelligence systems at Elter, I became a database user. As you might be aware, AI needs big data. Then I realised that despite the huge amount of data available, chasing it was not that easy. On the other hand, this was understandable: data is money, so it seemed normal for a company to struggle to access other companies’ data. I then learned how to generate data in a way that makes it usable for data-science projects. I also learned how to crawl the web looking for relevant data that I could reuse. Most of the time those data came from research labs. And every time, they were stored in a different way. Different file formats, different folder architectures… This led me to the idea of a common way of storing things. There are so many formats for images (JPEG, PNG, TIFF, bin…) and for tables (CSV, xlsx, txt…). TensorFlow released a way to store data ready for training with its TFRecord format. This is interesting when doing AI with TensorFlow but of no use when doing plain data science or using the Caffe framework. Soon a series of conversion tools sprang up on my computer: tif2bin, csv2tfrec… One of my new mantras when starting a project was to define naming rules and file architectures with the clients or colleagues, so as not to waste time refactoring everything. In the end, despite the cloud, despite the databases, everything still ended up back on hard drives when the time came to use it for processing (unless you’re using cloud computing, but this is not always possible).

Then came a worldwide episode: COVID-19. If any event should have played a key role in helping human beings work together, it is this one. While working remotely (like millions of people on the planet), I started to look for initiatives and ways to help with my poor data skills. What I found was quite amazing. Hundreds of initiatives all over the world. And none of them seemed to be aware of the others. Thousands of data websites. No real way to ensure their veracity (one of the magic Vs of big data). A seed was planted in between two neurons. Why is no organization, no institution, devoted to managing the world’s critical data?

Lately, I’ve been working on a personal project: creating an AI that helps divers recognize the fish they saw during their dive. In addition to having a playful side, this project aims to help researchers who wish to know more about fish populations in different parts of the globe. The goal is to add information about diving depth and water temperature and make it available in a formatted way, so that any research team in the world can use it efficiently. As I am now working in a research institution, and research is a more open world, I thought I could access more data: data hidden from companies but shared within the community in order to make progress. Indeed, there is competition for funding, but once the research is done and the paper published, the data should be made available. Here again, I was quite disappointed. The data is there, but the ways to make it available are hard and expensive. Researchers are not computer scientists, and asking them to share their data on a platform that requires programming skills is a waste of time. Despite the storage capacity, when data is stored, there is no worldwide agreement on the how. So each lab that manages to make some of its data available often publishes raw data with a README file explaining how to use it (when there is such a file). This file is often harder to read than a book in ancient Greek. There is no homogeneity in the format, and so it takes hours and hours to download the data, understand it, realize that the information inside does not match what we need, and start all over again. The seed that was planted a few months ago then started to sprout. As we are facing more and more global issues (rising ocean temperatures, deforestation, CO2 and now viruses), research conducted on one side of the planet is directly linked to research conducted on the other side. Data science tools and techniques are meant to build bridges between research teams, but data accessibility is too poor.

The idea behind this is the creation of an institution, a worldwide institution that would be the guardian of data. When a research team publishes its paper, its only duty is to give the data to this institution, which will store it in a homogeneous way. It will make it accessible only for non-profit uses and institutions. Any lab member could query its database and within minutes have access to data they might not have been aware of, along with the name of the lab that generated it. This could lead to new collaborations and avoid wasting time redoing things that were unsuccessful. The lines are still blurry around what such an institution should and shouldn’t be, and I’m nobody to draw them. I just wanted, through these lines, to sow a seed in some people’s minds. Maybe this will grow into more concrete action.

Thanks for reading. I’ll be very interested to read your comments on this.

Is remote the new office?

Lately, due to the COVID-19 crisis, remote working has been the norm for most companies. Lots of people are writing articles about what remote work is and giving tips on how to perform while remote. On the other hand, others are writing about how to get back to work after being home for a while, how to reinvent interactions and so on. In the middle of this, some are talking about reinventing work. This article is more of an essay about why we have already reinvented work, how to take advantage of the end of lockdown to develop new working habits, and the impact of all this on work performance and ecology. Let’s go.

During the ongoing crisis, we have been asked to work from home (when it is possible, of course). Thus we had to adapt, using a lot of tools in a new way or sometimes for the first time. Communication within teams has changed. Communication with clients too. More effort has been made to listen to each other in order to make things work. More patience too and, logically, more kindness. All this is the key to an efficient and constructive way of working. And if this remained stressful for some of us due to the lack of social bonds, is that a reason to go back to our regular offices and good old (bad?) habits?

Indeed, since we’ve been forced to adopt new habits, we have already done the most difficult part of the job. What if, instead of hurrying back to our grey offices under neon lights, we decided to keep on working from home and enjoy more time with our kids and friends? What if, when we need to work surrounded by people, our company offered to pay for a co-working space? What if we reduced face-to-face meetings to the strict minimum?

The first point looks like what we are living today, right? But with the freedom to go wherever you want during your more frequent free time. With the freedom to set your own working hours, even if they are restricted by some meetings or by being available for some colleagues. The key to this is the discipline that you probably already acquired during lockdown. This is the part of the path to a new way of working that we have already covered.

The second point is interesting. Indeed, as human beings, we are social animals. Of the kind that really needs to form social bonds on an everyday basis to keep its social order stable (the book Sapiens was an excellent choice to read during my stay at home). If social bonding is what you need, why should it be with your colleagues? You probably like them, but why not do what worked for you during your most brain-demanding period: university? During this period, students work together whether or not they come from the same field; they mostly need to be friends. From time to time, we had to do team work on specific topics (those strictly minimal meetings). The rest of the time, we were surrounded by people we loved and respected, who had no qualms about challenging our thoughts and who helped us see the world differently. Working in co-working spaces can have this effect. You keep the coffee-discussion Eureka effect, and you surround yourself with people who are not afraid to say you’re wrong (which is not necessarily the case at work) and who can help you come up with groundbreaking new ideas. On the other hand, when you need a specific technical solution, you can always reach your colleagues on Skype, Slack or any other social tool.

The last part of this reflection is about the consequences of all this. Let’s start with the professional advantages. You spend fewer hours in traffic, you’re less stressed, and your work/family balance is better. You work surrounded by people you choose, making it easier to handle challenges or issues in your work. This leads to better performance for your company and a better daily life for you. Your company needs less building infrastructure, so it spends less money on it. You’re also healthier, so you spend fewer days on sick leave. Sounds good, right?

What about the other effect: the ecological one? First, fewer people on the road, so logically this leads to less pollution in big cities. If you’re going to a co-working space, it’s more likely that you’ll choose one close to your home that you can reach easily. Fewer buildings dedicated to office work. What to do with all those buildings? Some can be turned into co-working spaces, some others into social housing, and the rest can be demolished to increase the proportion of parks and green spaces in cities. And why not go further and create downtown kitchen gardens? We were able to see the impact of a two-month lockdown; imagine the impact we could have if we keep going the same way.

To conclude this article, just a few lines about the people who cannot work remotely. They too will benefit from others working remotely, at least through their daily commute, which will be significantly shortened. Maybe if there is less traffic they will consider taking a bike instead of a car (having a proper shower at work also helps a lot). This change can seem hard, or insignificant, or pointless to some. In my opinion, humanity needs to get some lessons and reminders from Mother Nature sometimes, and we would be idiots not to listen to Her. Feel free to comment; I’d be glad to have your opinion on this.