Google Developer Days Europe: A pleasant surprise

Manuel Lamelas Business

“Too much developer stuff!” I remember telling Inés before the event. Not much of an excuse, but I guess I wasn’t really trying to convince anyone. Thankfully, my colleague didn’t share my grumpy mood, and her usual cheerful attitude easily helped me overcome my reluctance. So this is how our trip to Krakow began, with low expectations …

Big Data Is the New Black

Datatons Business

Everyone is talking about Big Data: whether it is one of the most disruptive technologies of recent years, the potential it has, the privacy problems that come with the new capacity to store data… It is true that, little by little, though in growing numbers, companies are embarking on the data adventure with …

Data Logistics with Apache NiFi

Manuel Lamelas BigData Architecture, Technology

As announced in a previous post, we’re now going to introduce you to Apache NiFi, the latest trend in ingestion tools: a new project from the Apache Software Foundation that lets you manage data flows through a cool graphical interface. If that hasn’t caught your attention yet, wait until you hear this: the NSA created it! NiFi – the UPS of data …

Kerberos & Hadoop: Securing Big Data (part I)

Celeste Duran BigData Architecture, Technology

When I began to use Hadoop with Kerberos, I felt as if I were in the middle of the ocean. I found a lot of information about Kerberos itself, but it was very difficult to find anything about how to use it with Hadoop, why to use it, and how to configure it to work with Hadoop. This trilogy of posts is going to …

Morphlines – Hadoop ETL by Cloudera

Manuel Lamelas BigData Architecture

Today we are going to talk about Morphlines, an open source framework developed by Cloudera that provides a new way to do ETL on Hadoop. What are these morphlines? Morphlines are simple configuration files that define how to transform data on the fly. A morphline consists of a file that describes the steps a data flow has to pass through in order to …

NoSQL vs Relational: Which database to use

Iván Alejandro Marugán BigData Architecture

Information collection has changed a lot in recent years. Everybody wants to store more data and let users consume that information in real time and in an easy way. This means that performance, scalability and availability are three key factors for database implementations. This is why NoSQL databases have appeared. What’s a NoSQL database? A NoSQL database (“non SQL”, …

The Reinvention of the Data Center in the Digital Era

Jorge Vidal Business

Datatons had the opportunity to take part in the IDG COMMUNICATIONS event, which brought together a number of prominent professionals from the data center world. The event covered interesting topics that are already a reality today and that will hurt companies unable to reinvent themselves, whether due to a lack of …

The Story of a Complicated Relationship and How Scala LX Changed It

Datatons Business

For a long time I have had a strange romance with the Scala programming language. Normally I don’t much care which tool I use when developing an analysis (different problems, different solutions), but I must confess that Scala has something that both wins me over and scares me. The first time I used it was …

Participation in the Informática para tod@s Conference – Inés Huertas

Datatons Business 2 Comments

This past May 5 I had the pleasure of being selected to speak at the Informática para tod@s conference, held in Mallorca and sponsored by ACM-W. And I say I had the pleasure because I truly met great talents: women who reinvent themselves and fight to give visibility to their work, as well as being great professionals in the world …

Random Forest – Modeling The Titanic Voyage with R

David Carrasco Data Science

What’s a Random Forest? Random Forest is a machine learning algorithm, typically used for classification and regression tasks in supervised learning, that consists of an ensemble (or group) of simple decision tree models that predict the value of a target variable from a set of input variables. Its main advantage over a single decision tree is that it reduces the …