Abstract

With the development of the IoT (Internet of Things), sensor networks can generate a large amount of valuable data. In addition to being used in local IoT applications, the data can also be traded through connected edge servers. As an efficient resource allocation mechanism, the double auction has been widely used in stock and futures markets and can also be applied to data resource allocation in sensor networks. In practice, multiple edge servers usually run double auctions and compete with each other to attract data users (buyers) and producers (sellers). Therefore, the double auction market run on each edge server needs an efficient mechanism to improve allocation efficiency. Specifically, the pricing strategy of the double auction plays an important role in determining traders’ profits, and thus affects traders’ market choices and bidding strategies, which in turn affect the outcome of the competition among double auction markets. Conversely, traders’ trading strategies also affect each market’s pricing strategy. Therefore, we need to analyze both the double auction markets’ pricing strategies and the traders’ trading strategies. We use a deep reinforcement learning algorithm combined with mean field theory to solve this problem, which has a huge state and action space. For trading strategies, we use the Independent Parametrized Deep Q-Network (I-PDQN) algorithm combined with mean field theory to compute the Nash equilibrium strategies, and we compare it with the fictitious play (FP) algorithm. The experimental results show that the I-PDQN algorithm computes significantly faster than the FP algorithm. For pricing strategies, the double auction markets dynamically adjust their pricing strategies according to traders’ trading strategies. This is a sequential decision-making process involving multiple agents, so we model it as a Markov game. We adopt the Multiagent Deep Deterministic Policy Gradient (MADDPG) algorithm to analyze the Nash equilibrium pricing strategies. The experimental results show that the MADDPG algorithm solves the problem faster than the FP algorithm.

Highlights

  • With the development of the IoT (Internet of Things), smart terminals embedded with a large number of sensors such as cameras, GPS, and gyroscopes are becoming more and more common in daily life [1], where massive amounts of data are collected [2]

  • We find that the Nash equilibrium pricing strategy obtained by the Multiagent Deep Deterministic Policy Gradient (MADDPG) algorithm is the same as the solution of the fictitious play (FP) algorithm, and that the MADDPG algorithm solves the problem faster than the FP algorithm

  • The results show that the Independent Parametrized Deep Q-Network (I-PDQN) algorithm needs more iterations to converge to the equilibrium, but the computation time of a single FP iteration is about 5.031 times that of the I-PDQN algorithm, and the total average time of the FP algorithm is 4.6745 times that of the I-PDQN algorithm

Summary

Introduction

With the development of the IoT (Internet of Things), smart terminals embedded with a large number of sensors such as cameras, GPS, and gyroscopes are becoming more and more common in daily life [1], where massive amounts of data are collected [2]. For example, traffic information can be collected from smartphones by an edge server and sold to navigation applications to optimize route planning. In this scenario, the double auction, an auction mechanism with multiple buyers and sellers (referred to as traders in the following), can be used by the edge server to trade data between data users (buyers) and data generators (sellers). We need to analyze the trading strategies of traders and the pricing strategies of double auctions in an environment with multiple competing edge servers running double auction markets. We will analyze the Nash equilibrium trading strategies and pricing strategies in this competing environment. This problem involves a large number of traders, who may have continuous bidding spaces and private preferences.
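
To make the double auction mechanism described above concrete, the following is a minimal sketch of how a single market could match buyers and sellers in one round. It is illustrative only: the function name `clear_double_auction`, the `Trade` record, and the pricing parameter `k` (a weighted average of the matched bid and ask) are assumptions introduced here, not the pricing rule or notation used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    buyer: int    # index of the buyer in the input list
    seller: int   # index of the seller in the input list
    price: float  # transaction price for this matched pair


def clear_double_auction(bids, asks, k=0.5):
    """Match the highest bids with the lowest asks for one round.

    bids: maximum price each buyer (data user) offers to pay
    asks: minimum price each seller (data producer) will accept
    k:    hypothetical pricing parameter in [0, 1]; the trade price is
          k * bid + (1 - k) * ask, so k = 0.5 splits the surplus evenly
    """
    # Buyers from highest to lowest bid, sellers from lowest to highest ask.
    buyers = sorted(enumerate(bids), key=lambda item: -item[1])
    sellers = sorted(enumerate(asks), key=lambda item: item[1])

    trades = []
    for (buyer_id, bid), (seller_id, ask) in zip(buyers, sellers):
        if bid < ask:
            break  # remaining pairs cannot trade profitably
        trades.append(Trade(buyer_id, seller_id, k * bid + (1 - k) * ask))
    return trades


# Three buyers and three sellers; only the first two pairs can trade.
trades = clear_double_auction(bids=[10.0, 7.0, 4.0], asks=[3.0, 6.0, 9.0])
for trade in trades:
    print(trade)
```

In this sketch, changing `k` shifts surplus between buyers and sellers, which is one simple way a market's pricing choice can influence which market traders prefer and how they bid.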

Related Work
Basic Settings
Nash Equilibrium Trading Strategy
Initialization
The Competing Pricing Strategy
Conclusion