Big Data Analytics
Big Data Analytics is the practice of examining massive volumes of data to uncover patterns, correlations, and actionable insights. As organizations across all sectors generate data at unprecedented rates, the ability to process and analyze this information becomes a cornerstone of competitive advantage. The journey begins with robust data collection and storage systems that enable seamless acquisition of structured and unstructured data from diverse sources. Ensuring data quality through meticulous data cleaning and preprocessing is crucial before meaningful data analysis can be conducted.
Visual representation plays an essential role in communicating complex findings. Using effective data visualization techniques, analysts can present insights in ways that inform and persuade decision-makers. Increasingly, big data strategies are customized through domain-specific analytics, which tailors solutions to fields like healthcare, finance, transportation, and retail.
The tools that power big data solutions are evolving rapidly. Staying informed about the latest tools and technologies in data science is critical for professionals aiming to stay at the forefront of innovation. Moreover, ethical considerations are paramount, particularly when analytics intersects with personal or sensitive information. The study of ethical and social aspects ensures that data is used responsibly and transparently.
Big data also intersects significantly with cybersecurity, where analytics is used to detect threats and prevent attacks. From monitoring user behavior to detecting anomalies in network traffic, big data systems support threat intelligence, incident response, and forensics. These capabilities are bolstered by AI and ML in cybersecurity, which automate the detection of complex attack patterns.
Analysts frequently engage with teams responsible for application security and network security to ensure that the infrastructure supporting analytics platforms remains protected. Security frameworks are guided by best practices in cybersecurity policy and often involve identity and access management to control data access.
Specialized sectors such as cyber-physical systems (CPS) security and operational technology (OT) security benefit from big data techniques to monitor and respond to anomalies in real-time industrial systems. Encryption methods rooted in cryptography help secure data transmission, while teams trained in ethical hacking simulate attacks to test analytics system resilience.
Training and culture play an equally important role. Professionals must understand the value of cybersecurity awareness and stay attuned to emerging areas in cybersecurity, which evolve alongside data technology. As many analytics platforms operate in the cloud, data professionals must coordinate with teams responsible for cloud security and endpoint security.
In short, Big Data Analytics is more than just a technical discipline—it is a strategic function that transforms raw information into insights that shape policy, innovation, and security. Its synergy with cybersecurity, ethical governance, and emerging technologies makes it one of the most important fields in today’s data-driven world.
Distributed Computing
Distributed computing involves breaking down large datasets into smaller chunks and processing them across multiple machines in parallel. This approach improves efficiency and enables the analysis of petabytes of data.
Key Features:
- Distributes data and processing tasks across clusters of machines.
- Fault-tolerant: Automatically handles hardware failures by replicating data across nodes.
- Scalable: Easily expands to accommodate growing data needs.
Key Technologies:
- Hadoop: A framework for distributed storage (HDFS) and processing of large datasets using the MapReduce programming model.
- HDFS: A distributed file system for storing massive datasets.
- MapReduce: Processes data by dividing tasks into smaller, parallel operations.
- Apache Spark: A faster, in-memory distributed computing engine.
- Supports a variety of workloads: batch processing, stream processing, machine learning, and graph analytics.
Applications:
- Analyzing user behavior on e-commerce websites.
- Processing and aggregating financial transactions at scale.
- Genomic data analysis for personalized medicine.
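The MapReduce model described above can be illustrated with a minimal word-count sketch in plain Python. This is a single-process simulation of the map, shuffle, and reduce phases; a real deployment would run these stages across a Hadoop or Spark cluster, and the function names here are illustrative.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input chunk.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Each "document" stands in for a data chunk processed on a separate node.
chunks = ["big data big insights", "data drives insights"]
mapped = chain.from_iterable(map_phase(c) for c in chunks)
result = reduce_phase(mapped)
# result: {'big': 2, 'data': 2, 'insights': 2, 'drives': 1}
```

Because each chunk is mapped independently, the map phase parallelizes naturally across machines, which is the core idea behind Hadoop's fault-tolerant scale-out.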
Real-Time Analytics
Real-time analytics focuses on analyzing data streams as they are generated, enabling immediate decision-making and action. It is essential for scenarios where timing is critical, such as fraud detection or monitoring live systems.
Key Features:
- Processes continuous streams of data with low latency.
- Enables businesses to act on data insights in real time.
Key Technologies:
- Apache Kafka: A distributed event streaming platform that enables high-throughput, fault-tolerant messaging.
- Example: Streaming logs from IoT devices for predictive maintenance.
- Apache Flink: A stream processing framework for real-time analytics.
- Features stateful computations and advanced event-time processing.
Applications:
- Fraud Detection: Identifying fraudulent transactions as they occur in financial systems.
- IoT Analytics: Monitoring sensor data in smart cities or industrial applications.
- Stock Market Analysis: Processing real-time trading data to identify market trends.
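The low-latency stream processing described above can be approximated with a small sliding-window sketch in plain Python. A production system would consume events from Kafka or Flink rather than a list, and the window size and threshold multiplier here are illustrative assumptions.

```python
from collections import deque

class SlidingWindowDetector:
    """Flag a reading as anomalous if it exceeds the mean of the
    recent window by a fixed multiplier (values chosen for illustration)."""
    def __init__(self, window_size=5, threshold=3.0):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def observe(self, value):
        anomaly = False
        if len(self.window) == self.window.maxlen:
            mean = sum(self.window) / len(self.window)
            anomaly = value > mean * self.threshold
        self.window.append(value)
        return anomaly

detector = SlidingWindowDetector()
stream = [10, 11, 9, 10, 12, 11, 95, 10]   # 95 mimics a fraudulent spike
flags = [detector.observe(v) for v in stream]
# Only the spike (95) is flagged once the window is warmed up.
```

The same pattern, applied per account or per sensor, is how streaming engines flag fraudulent transactions or failing equipment within milliseconds of the event arriving.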
Cloud Computing
Cloud computing provides a scalable and cost-effective solution for big data analytics by offering on-demand resources for data storage and processing. It eliminates the need for expensive on-premises infrastructure and allows organizations to scale up or down based on workload demands.
Key Features:
- On-demand resource provisioning.
- Global accessibility and high availability.
- Pay-as-you-go pricing models reduce upfront costs.
Major Cloud Platforms:
- Amazon Web Services (AWS):
- Amazon EMR: Managed Hadoop and Spark for big data processing.
- Redshift: Cloud-based data warehousing for large-scale analytics.
- Athena: Serverless query service for analyzing data stored in S3.
- Microsoft Azure:
- Azure Synapse Analytics: Unified analytics platform for big data and data warehousing.
- Azure Data Lake: Scalable storage for big data processing.
- Google Cloud:
- BigQuery: A serverless data warehouse for querying massive datasets.
- Cloud Dataflow: Real-time and batch data processing.
Applications:
- Hosting large-scale recommendation systems for e-commerce platforms.
- Managing customer analytics for SaaS companies.
- Analyzing global supply chain data in logistics and manufacturing.
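The pay-as-you-go pricing model noted above can be sketched as a simple cost estimator. The rates below are hypothetical placeholders, not actual prices from AWS, Azure, or Google Cloud; the point is that cost scales with usage rather than with provisioned capacity.

```python
def monthly_cost(storage_gb, queries_tb_scanned,
                 storage_rate=0.02, query_rate=5.00):
    """Estimate a monthly bill under a pay-as-you-go model.
    Rates are hypothetical: $/GB-month of storage and $/TB scanned."""
    return storage_gb * storage_rate + queries_tb_scanned * query_rate

# 500 GB stored and 12 TB scanned by queries in one month:
cost = monthly_cost(500, 12)
# 500 * 0.02 + 12 * 5.00 = 70.0 dollars
```

With no data stored and no queries run, the bill is zero, which is exactly the property that eliminates the upfront infrastructure costs described above.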
How These Sub-Areas Interconnect
- Distributed Computing forms the backbone for handling large-scale data processing, with frameworks like Hadoop and Spark dividing workloads across clusters.
- Real-Time Analytics builds on distributed systems to process streaming data, ensuring insights are actionable immediately.
- Cloud Computing integrates distributed and real-time analytics with elastic scalability and global reach, enabling cost-effective big data solutions.
Challenges in Big Data Analytics
- Data Volume: Storing and processing petabytes or exabytes of data requires robust infrastructure.
- Data Variety: Managing structured, unstructured, and semi-structured data formats.
- Data Velocity: Processing and analyzing data streams in real-time.
- Cost Management: Balancing infrastructure costs with performance needs.
- Data Security: Ensuring sensitive data is protected in distributed and cloud environments.
Why Study Big Data Analytics
- Making Sense of Massive and Complex Data
- Understanding the “3 Vs” of Big Data
- Gaining Hands-On Experience with Scalable Tools
- Driving Innovation and Competitive Advantage
- Preparing for High-Growth Careers in the Data Economy
Big Data Analytics: Conclusion
Big Data Analytics is revolutionizing industries by enabling organizations to harness vast amounts of data for actionable insights. By leveraging distributed computing, real-time analytics, and cloud computing, businesses can respond to market demands, optimize operations, and innovate at scale.
Big Data Analytics: Review Questions and Answers
1. What is big data analytics and why is it important in modern business?
Answer: Big data analytics involves examining large and varied data sets to uncover hidden patterns, correlations, and insights that can drive strategic decision-making. It is important in modern business because it enables companies to harness vast amounts of data from diverse sources and convert them into actionable intelligence. This process helps improve operational efficiency, optimize marketing strategies, and predict future trends. By leveraging big data analytics, organizations can achieve a competitive edge and respond swiftly to market dynamics.
2. How do machine learning techniques enhance big data analytics?
Answer: Machine learning techniques enhance big data analytics by automating the identification of complex patterns and relationships within large datasets. These techniques allow models to learn from historical data and make predictions or decisions without being explicitly programmed for every scenario. This capability enables businesses to gain deeper insights and improve forecasting accuracy. Consequently, machine learning drives efficiency in processing big data, ultimately leading to more informed and timely business decisions.
3. What role does data visualization play in the field of big data analytics?
Answer: Data visualization plays a critical role in big data analytics by converting complex data sets into visual representations like graphs, charts, and dashboards. These visual tools help stakeholders quickly grasp trends, anomalies, and relationships that might be difficult to detect in raw data. Effective visualization not only enhances comprehension but also facilitates data-driven decision-making by making insights more accessible to non-technical audiences. As a result, it bridges the gap between sophisticated analytics and strategic business action.
4. How does big data analytics support predictive analytics and forecasting?
Answer: Big data analytics supports predictive analytics and forecasting by leveraging historical and real-time data to build models that predict future trends. It uses statistical methods and machine learning algorithms to identify patterns that can forecast outcomes such as customer behavior, market trends, and operational risks. These predictive models enable organizations to plan more effectively and make proactive decisions. Ultimately, big data analytics transforms uncertainty into quantifiable insights that drive strategic planning.
5. What are the common challenges faced when working with big data?
Answer: Common challenges in big data analytics include managing the volume, velocity, and variety of data, often referred to as the three Vs. Data quality, integration from disparate sources, and storage scalability also pose significant hurdles. Moreover, ensuring data security and maintaining privacy are critical challenges that require robust governance and compliance measures. Addressing these issues necessitates advanced infrastructure, sophisticated tools, and continuous investment in data management technologies.
6. How can organizations ensure the accuracy and reliability of big data analytics results?
Answer: Organizations can ensure the accuracy and reliability of big data analytics results by implementing rigorous data cleansing and preprocessing procedures. Regular validation of data inputs, combined with continuous monitoring of model performance, helps maintain high standards of quality. Additionally, employing cross-validation techniques and benchmark testing ensures that the analytical models produce consistent and reliable predictions. These practices build trust in the analytics outcomes and support robust decision-making processes.
7. What is the significance of scalability in big data analytics systems?
Answer: Scalability in big data analytics systems is significant because it enables organizations to handle growing volumes of data without compromising performance. As data sources expand and data volumes increase, scalable systems ensure that processing and storage capabilities keep pace with demand. This adaptability is crucial for maintaining real-time analytics and supporting continuous business growth. Scalable systems also allow for cost-effective resource management by dynamically adjusting computing power and storage based on current needs.
8. How do cloud-based solutions contribute to big data analytics?
Answer: Cloud-based solutions contribute to big data analytics by providing flexible, scalable, and cost-efficient platforms for storing and processing large datasets. They enable organizations to quickly deploy analytics tools without the need for substantial upfront investment in infrastructure. Cloud solutions also facilitate collaboration and data sharing across geographically distributed teams, enhancing the speed and efficiency of data processing. This accessibility and scalability make cloud platforms an essential component in modern big data analytics strategies.
9. What impact does real-time data processing have on business intelligence?
Answer: Real-time data processing has a significant impact on business intelligence by allowing organizations to react swiftly to changing market conditions and operational challenges. It provides immediate insights that enable timely decision-making and proactive risk management. This capability is crucial in competitive environments where delays can lead to missed opportunities or increased exposure to threats. Real-time processing transforms data into actionable intelligence instantly, thereby enhancing the overall responsiveness and agility of the business.
10. How can big data analytics drive innovation and competitive advantage in various industries?
Answer: Big data analytics drives innovation and competitive advantage by uncovering hidden patterns and trends that inform new product development, targeted marketing, and operational improvements. It empowers organizations to make data-driven decisions that optimize resource allocation and improve customer experiences. By harnessing big data, companies can identify emerging market opportunities and stay ahead of industry trends. This proactive approach to innovation not only enhances efficiency but also positions organizations as leaders in their respective fields.
Big Data Analytics: Thought-Provoking Questions and Answers
1. How will advancements in artificial intelligence further transform big data analytics methodologies?
Answer: Advancements in artificial intelligence (AI) are expected to further transform big data analytics by enabling more sophisticated algorithms that can autonomously learn and adapt to complex data patterns. These advancements will allow for real-time, predictive analytics that not only process large datasets with greater efficiency but also generate insights that were previously unattainable. AI-driven models will improve accuracy in forecasting and anomaly detection, driving deeper and more actionable insights from big data.
In addition, as AI integrates more seamlessly with data analytics platforms, it will facilitate automated decision-making processes that reduce the need for human intervention in routine analyses. This transformation will empower organizations to quickly respond to dynamic market conditions and operational challenges, ensuring they remain agile and competitive in an increasingly data-centric world.
2. What are the ethical implications of extensive data collection in big data analytics, and how can organizations balance these concerns with the need for actionable insights?
Answer: Extensive data collection in big data analytics raises several ethical implications, including concerns about privacy, consent, and the potential misuse of personal information. Organizations must navigate the delicate balance between harnessing the power of big data and respecting individual rights. To address these concerns, it is essential to implement strong data governance frameworks that include clear privacy policies, anonymization techniques, and robust security measures.
By ensuring transparency in how data is collected and used, organizations can build trust with stakeholders while still leveraging the benefits of big data analytics. Regular audits, adherence to regulatory standards such as GDPR or CCPA, and ethical guidelines for data usage are critical in mitigating potential negative impacts and ensuring that data-driven insights are obtained responsibly.
3. How can big data analytics be leveraged to drive innovation in product development and customer experience?
Answer: Big data analytics can be leveraged to drive innovation in product development by identifying emerging trends, customer preferences, and market gaps that traditional methods might overlook. By analyzing vast amounts of customer data, organizations can uncover hidden insights that inform the creation of new products tailored to specific needs and desires. This approach not only enhances product relevance but also improves customer satisfaction by delivering solutions that resonate with target audiences.
In terms of customer experience, big data analytics enables personalized marketing, targeted communication, and real-time feedback mechanisms that enhance engagement and loyalty. By continuously monitoring and analyzing customer interactions across multiple touchpoints, companies can adapt their strategies to deliver more intuitive and effective experiences. The result is a more agile and innovative business model that continuously evolves based on data-driven insights.
4. What challenges do organizations face in integrating big data analytics with legacy systems, and what strategies can be implemented to overcome these obstacles?
Answer: Integrating big data analytics with legacy systems poses significant challenges, including compatibility issues, data silos, and limitations in processing power. Legacy systems often use outdated technologies and data formats that are not designed to handle the volume, velocity, and variety of modern big data. This can lead to difficulties in data integration, storage, and analysis, ultimately hampering the ability to derive actionable insights.
To overcome these obstacles, organizations can adopt a phased integration approach that involves modernizing legacy systems gradually. This may include the use of middleware solutions, data warehousing, and APIs to facilitate communication between old and new systems. Additionally, leveraging cloud-based platforms and scalable storage solutions can help bridge the gap, enabling organizations to harness big data analytics without completely overhauling existing infrastructure.
5. How can predictive analytics in big data transform risk management and decision-making processes?
Answer: Predictive analytics in big data transforms risk management by providing foresight into potential issues before they occur, enabling proactive measures rather than reactive responses. By analyzing historical data and identifying patterns, predictive models can forecast future risks, market trends, and customer behaviors with high accuracy. This allows organizations to allocate resources more effectively, optimize processes, and mitigate risks in a timely manner.
In decision-making, predictive analytics empowers leaders with actionable insights that support data-driven strategies. With the ability to anticipate market shifts and operational challenges, companies can make informed decisions that enhance competitiveness and drive long-term growth. This transformation ultimately leads to more resilient business practices that are better prepared for uncertainty and change.
6. What are the potential impacts of real-time analytics on operational efficiency and competitive advantage?
Answer: Real-time analytics has the potential to significantly enhance operational efficiency by providing immediate insights that enable rapid decision-making. Organizations can monitor live data streams to detect anomalies, respond to customer demands, and optimize supply chain operations without delay. This capability not only reduces downtime but also allows businesses to capitalize on emerging opportunities faster than their competitors.
Moreover, the ability to make informed decisions in real time can translate into a substantial competitive advantage. Companies that leverage real-time analytics can adapt their strategies on the fly, personalize customer interactions, and mitigate risks before they escalate, positioning themselves as agile and responsive market leaders in a dynamic business environment.
7. How might advancements in data visualization techniques improve the interpretability of complex big data insights?
Answer: Advancements in data visualization techniques have the potential to drastically improve the interpretability of complex big data insights by transforming intricate datasets into clear and intuitive visual representations. Innovative visualization tools such as interactive dashboards, 3D charts, and dynamic infographics enable users to explore data patterns and relationships with ease. These visualizations not only enhance understanding but also facilitate communication among stakeholders with diverse technical backgrounds.
By making data more accessible and engaging, advanced visualization techniques empower decision-makers to quickly grasp critical insights and act upon them. This enhanced interpretability bridges the gap between complex analytical outputs and practical business strategies, fostering a more data-driven culture within organizations.
8. How can organizations balance the need for data-driven insights with the potential risks of data overload and analysis paralysis?
Answer: Organizations can balance the need for data-driven insights with the risks of data overload by implementing effective data management strategies that prioritize quality over quantity. This involves establishing robust data governance frameworks that focus on collecting relevant, high-quality data and filtering out noise. By using advanced analytics tools and machine learning algorithms, organizations can automate the identification of key metrics and trends, thus reducing the cognitive burden on analysts.
Additionally, setting clear objectives and aligning data analytics efforts with strategic business goals helps ensure that only actionable insights are pursued. This focus minimizes the risk of analysis paralysis and allows decision-makers to concentrate on the most critical data, thereby driving efficient and effective business outcomes.
9. In what ways can collaborative platforms enhance the sharing and application of big data insights across organizations?
Answer: Collaborative platforms can enhance the sharing and application of big data insights by providing centralized environments where data, tools, and expertise are seamlessly integrated. These platforms facilitate real-time collaboration among data scientists, analysts, and decision-makers, enabling them to work together on complex datasets and share insights quickly. The integration of communication tools, version control, and data visualization features helps ensure that all stakeholders have access to the latest information and can contribute to refining analytical models.
Such collaboration fosters a culture of innovation and continuous improvement, as teams can learn from each other’s experiences and best practices. By breaking down silos and promoting transparency, collaborative platforms not only accelerate the pace of data-driven decision-making but also drive overall organizational growth and competitiveness.
10. How might the increasing volume and complexity of big data affect the scalability of analytics solutions, and what strategies can ensure long-term sustainability?
Answer: The increasing volume and complexity of big data pose significant challenges to the scalability of analytics solutions, as traditional systems may struggle to process and analyze vast datasets efficiently. This can result in slower response times, increased costs, and potential bottlenecks that hinder real-time analytics capabilities. To ensure long-term sustainability, organizations must adopt scalable architectures such as cloud computing, distributed processing, and parallel computing frameworks that can handle large-scale data workloads.
Investing in flexible, modular systems that can be upgraded or expanded as data volumes grow is also essential. Continuous monitoring, performance optimization, and regular system upgrades, along with robust data governance and management practices, will ensure that analytics solutions remain effective and sustainable over the long term.
11. What role does data quality play in deriving actionable insights from big data, and how can organizations ensure high data integrity?
Answer: Data quality is paramount in deriving actionable insights from big data because the accuracy, consistency, and reliability of analytical outcomes are directly tied to the quality of the input data. Poor data quality can lead to incorrect conclusions, misinformed strategies, and ultimately, significant business risks. Organizations can ensure high data integrity by implementing rigorous data cleaning, validation, and standardization processes throughout the data lifecycle.
Moreover, establishing strong data governance policies, employing advanced data integration tools, and conducting regular audits are critical to maintaining data quality. These measures not only improve the reliability of analytical insights but also build confidence among stakeholders that decisions are based on trustworthy and accurate information.
12. How can the integration of big data analytics and business intelligence transform decision-making processes at the executive level?
Answer: The integration of big data analytics and business intelligence transforms executive decision-making by providing comprehensive, real-time insights that are both actionable and strategically relevant. This convergence enables executives to access dynamic dashboards and predictive models that combine historical trends with forward-looking forecasts, offering a holistic view of the business environment. As a result, decision-makers can identify opportunities, mitigate risks, and allocate resources more effectively, driving enhanced operational performance and competitive advantage.
Furthermore, this integration promotes a culture of data-driven leadership, where strategic decisions are supported by robust quantitative evidence. The ability to quickly interpret complex data sets and translate them into clear business strategies empowers executives to respond swiftly to market changes and drive long-term organizational success.
Big Data Analytics: Numerical Problems and Solutions
1. A big data analytics system processes 8,000,000 records per day. If a sampling method selects 5% of these records for analysis, calculate the sample size per day, then determine the total sample size for a month (30 days), and finally compute the average number of records per day in the monthly sample.
Solution:
• Step 1: Daily sample size = 8,000,000 × 0.05 = 400,000 records.
• Step 2: Monthly sample size = 400,000 × 30 = 12,000,000 records.
• Step 3: Average daily sample remains 400,000 records (12,000,000 ÷ 30 = 400,000).
2. A machine learning model is trained on a dataset of 500,000 examples and achieves an accuracy of 88%. Calculate the number of correctly predicted examples, the number of misclassified examples, and the misclassification rate percentage.
Solution:
• Step 1: Correct predictions = 500,000 × 0.88 = 440,000.
• Step 2: Misclassified examples = 500,000 – 440,000 = 60,000.
• Step 3: Misclassification rate = (60,000 ÷ 500,000) × 100 = 12%.
3. A data processing pipeline handles 150,000 records per hour. If processing efficiency is improved by 40%, calculate the new processing rate per hour, the total records processed in a 24-hour period before and after the improvement, and the percentage increase in daily processing.
Solution:
• Step 1: Improved rate = 150,000 × (1 + 0.40) = 210,000 records per hour.
• Step 2: Daily processing before = 150,000 × 24 = 3,600,000 records; after = 210,000 × 24 = 5,040,000 records.
• Step 3: Percentage increase = ((5,040,000 – 3,600,000) ÷ 3,600,000) × 100 = 40%.
4. A regression model has a mean absolute error (MAE) of $1,500. If improvements reduce the MAE by 30%, calculate the new MAE and the absolute reduction in error per prediction.
Solution:
• Step 1: Absolute reduction = $1,500 × 0.30 = $450.
• Step 2: New MAE = $1,500 – $450 = $1,050.
• Step 3: Absolute reduction remains $450 per prediction.
5. A data visualization dashboard displays 10 key metrics updated every 15 minutes. Calculate the number of updates per day, per month (30 days), and per year (365 days).
Solution:
• Step 1: Updates per day = (24 × 60) ÷ 15 = 96 updates.
• Step 2: Updates per month = 96 × 30 = 2,880 updates.
• Step 3: Updates per year = 96 × 365 = 35,040 updates.
6. A clustering algorithm segments 1,000,000 data points into 10 clusters. If one cluster contains 15% of the data, calculate the number of data points in that cluster, the number remaining for the other clusters, and the average number of data points per remaining cluster.
Solution:
• Step 1: Points in the cluster = 1,000,000 × 0.15 = 150,000.
• Step 2: Remaining points = 1,000,000 – 150,000 = 850,000.
• Step 3: Average per remaining cluster = 850,000 ÷ (10 – 1) = 850,000 ÷ 9 ≈ 94,444.44.
7. A predictive analytics model takes 0.004 seconds per prediction. If 2,500,000 predictions are made, calculate the total processing time in seconds, convert it to minutes, and then to hours.
Solution:
• Step 1: Total time = 2,500,000 × 0.004 = 10,000 seconds.
• Step 2: In minutes = 10,000 ÷ 60 ≈ 166.67 minutes.
• Step 3: In hours = 166.67 ÷ 60 ≈ 2.78 hours.
8. A company’s big data analytics project increases revenue by 8% on an initial revenue of $2,000,000. Calculate the revenue after the increase, the absolute revenue gain, and the percentage increase.
Solution:
• Step 1: Revenue gain = $2,000,000 × 0.08 = $160,000.
• Step 2: New revenue = $2,000,000 + $160,000 = $2,160,000.
• Step 3: Percentage increase = ( $160,000 ÷ $2,000,000) × 100 = 8%.
9. A dataset has 10 features and 1,000,000 records. If feature reduction techniques remove 30% of the features, calculate the new number of features, and the percentage reduction in the dataset’s dimensionality.
Solution:
• Step 1: Features removed = 10 × 0.30 = 3 features.
• Step 2: New feature count = 10 – 3 = 7 features.
• Step 3: Percentage reduction = (3 ÷ 10) × 100 = 30%.
10. A linear regression model is defined as y = 4x + 5. For x = 20, calculate the predicted y, then if the actual y is 90, compute the absolute error and the percentage error relative to the actual value.
Solution:
• Step 1: Predicted y = 4 × 20 + 5 = 80 + 5 = 85.
• Step 2: Absolute error = |90 – 85| = 5.
• Step 3: Percentage error = (5 ÷ 90) × 100 ≈ 5.56%.
11. A time series model forecasts monthly growth at a rate of 3% on an initial revenue of $500,000. Calculate the revenue after one month, after six months (compounded monthly), and the overall percentage growth over six months.
Solution:
• Step 1: Revenue after one month = $500,000 × 1.03 = $515,000.
• Step 2: Revenue after six months = $500,000 × (1.03)^6 ≈ $500,000 × 1.194052 = $597,026.
• Step 3: Overall percentage growth = (($597,026 – $500,000) ÷ $500,000) × 100 ≈ 19.41%.
12. A data science project reduces operational costs by 20% from an initial cost of $800,000 annually. Calculate the annual cost after reduction, the cost savings, and the ROI if the project investment is $100,000.
Solution:
• Step 1: Annual cost after reduction = $800,000 × (1 – 0.20) = $800,000 × 0.80 = $640,000.
• Step 2: Cost savings = $800,000 – $640,000 = $160,000.
• Step 3: ROI = ($160,000 ÷ $100,000) × 100 = 160%.
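The worked solutions above can be checked programmatically. The sketch below re-derives problems 1, 10, 11, and 12 and asserts that each step matches the stated answer.

```python
# Problem 1: 5% daily sample of 8,000,000 records, over 30 days.
daily_sample = 8_000_000 * 0.05
monthly_sample = daily_sample * 30
assert daily_sample == 400_000 and monthly_sample == 12_000_000

# Problem 10: regression y = 4x + 5 at x = 20, actual y = 90.
predicted = 4 * 20 + 5
abs_error = abs(90 - predicted)
pct_error = abs_error / 90 * 100
assert predicted == 85 and abs_error == 5 and round(pct_error, 2) == 5.56

# Problem 11: 3% monthly growth on $500,000, compounded for 6 months.
revenue_6m = 500_000 * 1.03 ** 6
growth_pct = (revenue_6m - 500_000) / 500_000 * 100
assert round(revenue_6m) == 597_026 and round(growth_pct, 2) == 19.41

# Problem 12: 20% cost reduction on $800,000 with a $100,000 investment.
savings = 800_000 * 0.20
roi = savings / 100_000 * 100
assert savings == 160_000 and roi == 160
```

Running the script without an assertion error confirms the arithmetic in those four solutions.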