Extract, transform, and load. It doesn’t sound too complicated. But, as anyone who’s managed a data pipeline will tell you, the simple name hides a ton of complexity. And while none of the steps are easy, the part that gives data engineers nightmares is the transform: taking raw data, cleaning it, filtering it, reshaping it, summarizing it, and rolling it up so that it’s ready for analysis. That’s where most of your time and energy goes, and it’s where there’s the most room for mistakes. If ETL is so hard, why do we do it this way? The answer, in short, is that there was no other option. Data warehouses couldn’t handle the raw data as it was extracted from source systems, in all its complexity and size. So the transform step was necessary before you could load and eventually query data. The cost, however, was steep. … [Read more...] about ETL is dead
With machines comes automation. And with automation comes a huge shift in how we live. What will the future of work look like? What does automation mean for humanity’s collective and individual purpose? What role will automation play in the social and economic displacement of people? There’s a tendency to take a negative viewpoint where automation is involved. But the reason so many of us are pursuing this path forward is that it promises the potential to change society for the better. Here are just a few examples of how automation can positively reshape our lives. The reduction of labor: Working longer hours doesn’t translate into comparable output. In fact, the longer our hours become, the more inefficient we get. Long hours are correlated with ill health and lower performance, costing US companies an estimated $300 billion each year. Automation provides opportunities to reduce these working hours, rebalancing our work and personal lives, and returning cost benefits … [Read more...] about The drivers and social responsibility of automation
Information technology systems of the future are increasingly focused on where data is generated and processed, how it’s delivered and collected, and how quickly this data can move. Finding the most efficient path is key. Two of the most significant trends are the internet of things (IoT) and artificial intelligence (AI), which fit together like hand in glove. In a very simple form, IoT is about a multitude of devices exchanging data from a multitude of data points, which are being collected in a plethora of ways and on a plethora of platforms. That data must be quickly analyzed and, in most cases, sent to the next level for further processing. Meanwhile, AI is about programmatically manipulating this big data to make real-time and time-sensitive decisions. The only way to build for this technological union is with a hybrid multicloud platform. The elements of hybrid IT infrastructure providing the most efficient path for AI and IoT form the foundation of technologies that will … [Read more...] about AI, IoT, and the hybrid cloud: the triumvirate of IT’s future
One of the first considerations for developers building mobile and web apps is how to handle account security, namely how they’re going to protect and authenticate their users and their data. The days when a username and password were sufficient to protect accounts are long behind us, and we’re reminded of this nearly every day in the form of large-scale breaches, high-profile account takeovers, or massive digital heists. Andrew Baker is a developer-educator at Twilio. Adding stronger account security such as two-factor authentication (2FA) to your app is one of the simplest ways to increase security, protect users from cyberattacks, and build trust in your product, all while maintaining a smooth user experience. This quick-start guide walks you through building a Node.js, AngularJS, and MongoDB application that restricts access to a URL. I’ll be demonstrating four methods of delivering 2FA: SMS, Voice, Soft Tokens, and Push Notifications. … [Read more...] about Getting started with Twilio account security using Node.js and MongoDB
Two weeks ago, I spent time in Orlando, Florida, attending Microsoft’s huge IT pro and developer conference known as Microsoft Ignite. Having the opportunity to attend events such as this to see the latest in technological advancements is one of the highlights of my job. Every year, I am amazed at what new technologies are being made available to us. The pace of innovation has increased exponentially over the last five years. I can only imagine what the youth of today will bring to this world as our next generation’s creators. Microsoft’s CEO, Satya Nadella, kicked off the vision keynote on Day 1. As always, he gets the crowd pumped up with his inspirational speeches. If you saw Satya’s keynote last year, you could almost bet on what he was going to be talking about this year. His passion, and Microsoft’s mission, is to empower every person and every organization on the planet to achieve more. This is a bold statement, but one that I believe is possible. He … [Read more...] about AI and quantum computing: technology that’s fueling innovation and solving future problems
Initially open-sourced in 2012 and followed by its first stable release two years later, Apache Spark quickly became a prominent player in the big data space. Since then, its adoption by big data companies has been on the rise at an eye-catching rate. In-memory processing: Undoubtedly a key feature of Spark, in-memory processing is what lets the technology deliver speed that dwarfs the performance of conventional big data processing. But in-memory processing isn’t a new computing concept, and there is a long list of database and data-processing products designed around it. Redis and VoltDB are a couple of examples. Another example is Apache Ignite, which is also equipped with in-memory processing capability supplemented by a WAL (write-ahead log) to address performance of big data queries and ACID (atomicity, consistency, isolation, durability) transactions. Evidently, the functionality of in-memory processing alone isn’t quite sufficient to … [Read more...] about The rise and predominance of Apache Spark
The business world is always in flux. New technologies and strategies are released at what feels like the speed of light. Two of the technologies receiving a lot of talk on blogs and social media are robotic process automation (RPA) and artificial intelligence (AI). To help you wrap your arms around this emerging technology, I put together a high-level overview of the RPA landscape. RPA defined: RPA is the use of machines to perform tasks with high precision at rapid intervals. Virtual robots (as opposed to the robots wandering in the Amazon warehouses) are used to perform automated steps in a process. A distinct aspect of virtual robots is that their actions are performed from the end user’s perspective. This gives additional visibility into the viability of a software product or process that a system monitoring tool would not address. Usually, these transactions are performed repetitively and at high volumes, since the virtual robots are able to run 24/7. Types of … [Read more...] about Intro to robotic process automation
With more global digital transformations taking place, customer experience is the most important area an enterprise needs to invest in for business growth. At each touchpoint, enterprises have the responsibility to engage and retain customers, and operating in real time is a key way to be successful. Knowing when a customer submits a support ticket, when there’s an issue with the supply chain, or when a customer discount is available should all be possible in real time, enabling all business functions to coordinate seamlessly and create “wow” customer experiences. Today, customers expect a brand to do things instantaneously and without flaws, so a real-time approach is essential. Without one, an enterprise can easily experience complications in terms of engineering, costs, resources, expertise, planning, vendor management, brand reputation, business loss, and more. And most … [Read more...] about Embrace real-time data integration to create positive customer experiences
Some industries are slow to adapt their products to meet the changing demands of today’s consumers. Home security is one of those industries. Once the first models were built, they remained virtually unchanged long enough to become ineffective. The way people live today is not the same as it was fifty years ago. Today, people frequently travel outside their homes for extended periods of time and want to know that their property is safe. They’re used to having immediate contact with friends and family through smartphone apps, and expect the same from their home security system. Home security started out as a brilliant invention: In the 1800s, the invention of the telegraph and battery inspired hobbyists and electrical engineers to pursue ways of improving telecommunication. Around this time, a Unitarian minister from Boston named Augustus Pope began inventing the first burglar alarm. Contemplating the dilemma of how to get his electronic invention to ring a bell, he found the … [Read more...] about Is AI going to save the home security industry?
At Atlassian, we focus a lot on career development and ensuring our engineers have a path to success at the company. A recent conversation about this got me thinking of a way that all engineers can increase their value to their teams, and even achieve a level of hero status at work. It could also lead to a new income opportunity down the road. Excelling at work means more than just performing well in the role you are assigned. Those who move up the ladder and get put on the exciting projects tend to be the ones who think broadly about their work and how they can improve themselves and their work environment. For example, you might identify bottlenecks to performance and do something about them. The app marketplaces set up by enterprise software vendors provide a great opportunity for doing this. More and more vendors are opening up their platforms and tooling to let developers build apps that provide just the functionality companies need to work more efficiently. … [Read more...] about How an app marketplace can shorten your workday and advance your career
Data is the new oil. As enterprises increasingly recognize the value of digital assets, infrastructure considerations have evolved well beyond simply storing data into the growing field of data management. Traditional views on managing data typically involve block storage for database and application servers. But videos, images, sensor data, and other unstructured file data that can’t be easily stored in traditional, relational databases are bigger and much harder to manage. Real opportunity lies in data management, and unstructured data requires a different approach. With unstructured data predicted to grow at a compound annual growth rate of 29.8 percent through 2021, according to IDC Research, data backup and archive must evolve. Here are four … [Read more...] about Data management: beyond just storing bits
Tech companies today spend a lot of time thinking about how to recruit and retain the best engineering talent. This is one of the reasons why perks (like daily catered meals, free snacks, and onsite game rooms) have become the subject of countless online slideshows about the tech industry. But if software engineering is one of your core competencies, these perks are nowhere near the most important things that you can be doing to improve the day-to-day experience of your most valuable employees. Instead, I believe that the key to engineer happiness is productivity: making it easier for developers to do their jobs and work on solving hard problems, without letting anything else get in their way. After talking to dozens of companies, from startups to 100-year-old corporations, about ways to make their software engineering efforts scale, I believe that there are a handful of steps that can be recommended across the board. The suggestions below go a long way to addressing key pain … [Read more...] about The best perk to give software engineers
People think of autonomous vehicles as incredibly powerful, automatic machines, but the software that guides those machines is programmed by individual people in a manual effort. We’re spoiled with technology that seems borderline magical, but even the AI programs that manage to learn on their own started out in the hands of human beings. This manual effort is part of what’s slowing down the progress of self-driving cars (along with slow regulatory progress and logistical hurdles), but a new approach, microtasking, might offer a solution to the problem. The programming challenges of self-driving vehicles: First, let’s focus on the manual programming challenges of self-driving vehicles. The basic architecture of the programs used for autonomous vehicles isn’t ridiculously complicated; in fact, it’s mostly based on traffic laws. For example, you’ll need to teach a machine that red lights mean “stop” and green lights mean … [Read more...] about How microtasking is fueling a surge in AI growth for self-driving vehicles
At first glance, building a real-time application may sound like a daunting proposition, one that involves technical challenges as well as a significant financial investment, especially when you have an application goal of responding within a fraction of a second. But advances in hardware, networking, and software, both commercial and open source, make building real-time applications today very achievable. So what do these real-time applications look like? This article presents three common real-time application patterns that require a real-time decision, meaning a response returned or transaction executed based on real-time input. To determine which pattern to apply to your application, you must first define your real-time objective. Ask yourself: How fast does the application need to respond? Each application pattern addresses a particular level of real-time response: sub-millisecond, milliseconds, or 100 milliseconds and greater. … [Read more...] about What real-time application pattern works for you?
The pricing models for compute resources in the cloud can be complicated. Some (but not all) variations include the following: on-demand instances, reserved/prepaid capacity, spot instances, and dedicated instances. On-demand pricing is pretty straightforward: for every hour that a compute resource runs you pay a certain hourly cost. Reserved pricing allows you to significantly reduce your hourly cost by committing, and prepaying, to run a compute resource for an agreed period of time. Spot pricing allows you to establish a maximum bid price for a compute resource and, if there is a resource available at or below that cost, you pay the current spot price. And dedicated instances cost the most, but give you dedicated hardware on which to run your application. With all of these options, how do you structure your compute strategy to guarantee that you have the resources you need, but minimize your cloud bill? In this post I review reserved pricing for prepaid capacity and the implications to your … [Read more...] about If you don’t make a reservation, you’re going to need to tip the maître d’
The emergence of cloud has led to an explosion of data that has left data scientists in high demand. A job that didn’t exist a decade ago has topped Glassdoor’s ranking of best roles in America for two years in a row, based on salary, job satisfaction, and number of job openings. It was even dubbed the “sexiest job of the 21st century” by the Harvard Business Review. Though growing in number, data scientists are scarce and busy. A recent study shows that demand for data scientists and analysts is projected to grow by 28 percent by 2020. This is on top of the current market need. According to LinkedIn, there are more than 11,000 data scientist job openings in the US as of late August. Unless something changes, this skills gap will continue to widen. Against this backdrop, helping data scientists work more efficiently should be a key priority. That is why it’s a problem that most data scientists currently spend only 20 percent of their time on actual data … [Read more...] about The 80/20 data science dilemma
You just have to look around you to see people everywhere browsing the internet through various devices. An estimated 3.5 billion people worldwide now browse the internet on their mobile phones alone, and that doesn’t include people who use other devices such as tablets. With that number set to increase sevenfold between now and 2021, businesses cannot afford to dismiss mobile and tablet internet usage. However, traditional mobile responsive websites have a significant Achilles’ heel: their load times. A study by Google found that over half of consumers will abandon a website that takes more than three seconds to load. Speed is clearly of the essence, and making it easy for consumers to access your content is even more important. You can understand, therefore, why many businesses saw apps as a way to engage with potential customers in a quick and relatively effective way. It used to be said that there was an app for everything, but the general public quickly fell out of love with them … [Read more...] about 8 reasons to develop a progressive web app
Artificial intelligence (AI) is permeating enterprise technology at a faster rate than ever before, yet it is still just the beginning of the adoption curve. From self-driving cars to intelligent assistants to cybersecurity to creating more personalized search results and recommendations, AI and its subsets, machine learning and deep-learning technology, are already making an impact on our experiences at home and at work. However, with all the upsides resulting from the trend toward using powerful datasets to generate new value come downsides. I’m not referring to the doomsaying that AI will destroy us, but the unfortunate fact that software and technology vendors looking to capitalize on the hype will exaggerate their AI capabilities to get your attention and boost sales. The concept of “washing” isn’t new: we experienced it with “greenwashing” and even “cloud washing,” and we should rightly be skeptical of companies claiming to … [Read more...] about How to recognize (and avoid) ‘AI washing’
While mobile is the biggest digital channel, and app stores are swamped daily with new apps as well as iterations of existing ones, the ability to release faster and with higher automation coverage is still a big pain for most organizations. The recent World Quality Report by Sogeti found that only 29 percent of tests are being automated. When exploring the reasons for such a low percentage, especially in the mobile space, we see five key challenges that can explain that low rate; some are very much related to the findings in the report. 1. Tighter release schedules leave less time to automate: Organizations don’t have time to integrate new tests into their existing test cycles, and therefore have less time to develop new test code. Due to … [Read more...] about 5 key challenges for mobile test automation
If you want to do devops, then you need to have a deep understanding of the principles, values, and concepts that drive it. Devops may be a trendy topic, but it brings together important concepts that come from multiple sources. In this list, I recommend five books that help lay that conceptual groundwork. All are highly influential and build an understanding of how you should think about scaled software delivery in large enterprises. While The Phoenix Project, The Devops Handbook, and Continuous Delivery form the core building blocks of the devops movement, those books should already be on the bookshelf of any devops enthusiast, and they are covered well in other lists, such as George Hulme’s “5 great books on devops.” Here, I focus on books that provide additional context to these fundamental books and help frame the core … [Read more...] about 5 great books for devops enthusiasts