The Internet of Things (IoT) has revolutionized many fields by enabling the processing of vast amounts of data from an ever-growing number of IoT devices. While cloud-centric systems can efficiently store and analyze this data through centralized analytics, they may eventually reach their limits as the number of resource-constrained IoT devices grows. To address this issue, edge-enhanced systems alleviate the computational burden on the cloud for performance-critical analytics. However, IoT sensors often produce redundant data, which leads to data management issues and limits the efficiency of real-time data processing and transmission across networks. Edge-based data compression can address this issue by reducing the amount of data transmitted over networks, but it also poses challenges in decoding the data and performing inference on the server side within acceptable latencies and Quality of Service (QoS). Preprocessing or compressing data at the edge before transmission can alleviate some of these issues, but latency and network congestion may persist while training remains centralized. Additionally, the large volume of streaming data from heterogeneous edge devices makes it difficult to guarantee high-quality training data.

This thesis investigates how the edge can alleviate the computational burden on the cloud for performance-critical analytics. We introduce edge-enhanced learning and analytics, which combines representation learning, model partitioning, model compression, and model selection in three steps.

Firstly, we introduce communication-efficient representation learning to overcome the limitations of static IoT systems and make them more adaptive to changing application requirements. Previous approaches have addressed the issues of data redundancy and data management using deep clustering and k-d tree algorithms in isolation. We combine these two methods to select minimally redundant and representative instances for communication-efficient data offloading. The proposed approach handles both low-dimensional and high-dimensional data, and finds a trade-off between computational complexity, transmission costs, and resource constraints at the edge. To process high-dimensional data, we apply an autoencoder-based coreset algorithm that reduces the data to a lower-dimensional space called the k-diverse representations. The novelty of this approach comes from modifying an autoencoder model to produce multimodal k-diverse data consisting of k-diverse representations, normal instances, and anomalous instances, allowing the cloud to receive the desired kind of minimally redundant data. The ability to combine multiple data summarisation goals further lowers the training cost by 66%.

Secondly, we present QoS-aware edge-enhanced data compression learning to address the challenge of decoding the data and performing inference on the server side within an acceptable QoS. A principled compression learning method is designed and implemented for discovering the compression models that offer the right QoS for an application. It works through a novel modularisation approach that maps features to models and stratifies them across a range of models. An automated QoS-aware orchestrator has been designed to select, in real time, the best autoencoder model for compressive offloading in edge-enhanced clouds based on changing QoS requirements.
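To make the orchestrator's selection logic concrete, the following is a minimal sketch of QoS-aware model selection. It is illustrative only: the `ModelProfile` fields, the `select_model` scoring rule, and the example registry entries are assumptions made for exposition, not the thesis implementation.

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    """An offline-profiled autoencoder variant (fields are illustrative)."""
    name: str
    compression_ratio: float   # input bytes / encoded bytes
    latency_ms: float          # measured server-side decode + inference time
    accuracy: float            # downstream task accuracy on validation data

def select_model(registry, max_latency_ms, min_accuracy):
    """Among the profiled models satisfying the current QoS requirements,
    return the one with the highest compression ratio."""
    feasible = [m for m in registry
                if m.latency_ms <= max_latency_ms and m.accuracy >= min_accuracy]
    if not feasible:
        return None  # e.g., fall back to offloading raw data
    return max(feasible, key=lambda m: m.compression_ratio)

# Hypothetical registry: when QoS requirements tighten at runtime,
# the orchestrator switches to a lighter autoencoder.
registry = [
    ModelProfile("ae-deep", compression_ratio=32.0, latency_ms=45.0, accuracy=0.92),
    ModelProfile("ae-small", compression_ratio=8.0, latency_ms=12.0, accuracy=0.90),
]
print(select_model(registry, max_latency_ms=20.0, min_accuracy=0.88).name)  # ae-small
```

Under this greedy rule, relaxing the latency budget to 50 ms would switch the selection back to the heavier but more accurate "ae-deep" variant.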
To our knowledge, this is one of the first attempts at harnessing the capabilities of autoencoders for edge-enhanced compressive offloading based on portable encodings, latent space splitting, and fine-tuning of network weights. The search strategy reduces the computational cost of searching the entire model space by up to 89%. When deployed on an edge-enhanced cloud in an Azure IoT testbed, the approach saves up to 70% in data transfer costs and takes 32% less time for job completion. It eliminates the additional computational cost of decompression, thereby reducing the processing cost by up to 30%.

The third step of our approach further improves upon the first two by introducing distributed learning for real-time analytics. Although model learning can be performed offline in some environments, it can be prohibitive in distributed IoT systems due to data transfer costs, privacy concerns, and low-latency requirements. In this step, we focus on enhancing the capacity of large-scale streaming systems to support near-real-time analytics via distributed learning. The edge-enhanced distributed learning system supports fully unsupervised analytical tasks in near real time, such as anomaly detection and clustering, by analyzing the differences between local and global models. To address the limitations of existing federated learning methods, asynchronous learning has been implemented to aggregate model updates without waiting for stragglers. We apply a novel relevance-based local scheduling method that eliminates the transmission of irrelevant and expensive updates. For incoming data streams, the proposed system learns 34% faster in real time without compromising accuracy, while using 10 times less data. The communication cost of our approach is up to 10 times smaller than that of the best available federated learning models.

We believe that such systems will enable many practical real-time applications in IoT streaming systems where low-latency analytics are needed and where useful labels are rare or extremely difficult to collect. This research paves the way for a range of unexplored possibilities at the intersection of machine learning and edge computing for edge-enhanced analytics in large-scale distributed systems.
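As a simplified, self-contained illustration of the third step, the sketch below combines a relevance check on local updates with staleness-aware asynchronous aggregation. The divergence threshold, the staleness-discounted learning rate, and the function names (`is_relevant`, `apply_async`) are assumptions made for exposition; they stand in for, rather than reproduce, the relevance-based scheduling and asynchronous aggregation used in the thesis.

```python
import numpy as np

def is_relevant(delta, global_model, threshold=0.05):
    """Relevance-based local scheduling (illustrative): skip transmitting a
    model delta whose magnitude is negligible relative to the global model,
    since sending it would cost more than the information it carries."""
    return np.linalg.norm(delta) > threshold * (np.linalg.norm(global_model) + 1e-12)

def apply_async(global_model, delta, staleness, base_lr=0.5):
    """Asynchronous aggregation: fold in each delta as it arrives,
    down-weighting stale contributions instead of waiting for stragglers."""
    lr = base_lr / (1.0 + staleness)
    return global_model + lr * delta

# One hypothetical round: three edge nodes report at different times.
global_model = np.zeros(4)
deltas = [
    (np.array([0.4, 0.1, 0.0, 0.2]), 0),    # fresh, informative update
    (np.array([1e-3, 0.0, 0.0, 1e-3]), 1),  # negligible: filtered out locally
    (np.array([0.3, 0.2, 0.1, 0.1]), 3),    # straggler: applied with low weight
]
for delta, staleness in deltas:
    if is_relevant(delta, global_model):
        global_model = apply_async(global_model, delta, staleness)
print(global_model)
```

In this toy round, the second node never transmits its near-zero update, and the stale straggler still contributes, just with a reduced weight, rather than blocking the aggregation as it would in a synchronous scheme.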