
From: http://www.hollischuang.com/archives/1716

At present, almost all large websites and applications are deployed in a distributed way, and data consistency in distributed scenarios has always been an important topic. The distributed CAP theorem tells us that no distributed system can simultaneously satisfy consistency, availability, and partition tolerance; at most two can be met at once. So many systems have to make a trade-off at design time. In most internet scenarios, strong consistency is sacrificed in exchange for high availability, and the system usually only needs to guarantee "eventual consistency", as long as the time it takes to converge is acceptable to users.

In many scenarios we need technical means such as distributed transactions and distributed locks to guarantee eventual consistency. Sometimes we need to ensure that a method can be executed by only one thread at a time. In a single-machine environment, Java provides plenty of concurrency APIs for this, but those APIs do not work across machines. In other words, plain Java APIs cannot provide distributed locking, which is why a number of distributed lock implementation schemes exist.

For implementing a distributed lock, the following schemes are commonly used:

  • Implement a distributed lock based on a database
  • Implement a distributed lock based on a cache ( redis, memcached, tair )
  • Implement a distributed lock based on zookeeper

Before analyzing these implementations, let's first look at what kind of distributed lock we actually need. ( Here we take a method lock as the example; a resource lock works the same way. )

  • In a cluster of distributed application deployments, the same method can be executed by only one thread on one machine at any given time.
  • This lock should be a reentrant lock ( to avoid deadlock ).
  • This lock is best implemented as a blocking lock ( decide according to business requirements ).
  • Acquiring and releasing the lock must be highly available.
  • Acquiring and releasing the lock must perform well.

Distributed locks based on a database

Based on a database table

The simplest way to implement a distributed lock is probably to create a lock table directly and implement locking by manipulating records in that table.

When we want to lock a method or resource, we insert a record into the table; when we want to release the lock, we delete that record.

Create a database table:

CREATE TABLE `methodLock` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'name of the locked method',
  `desc` varchar(1024) NOT NULL DEFAULT '' COMMENT 'remarks',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'record time, generated automatically',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='methods currently locked';


When we want to lock a method, we execute the following SQL:

insert into methodLock(method_name, `desc`) values ('method_name', 'desc')


Because we have a unique constraint on method_name, if multiple requests submit this insert to the database at the same time, the database guarantees that only one of them will succeed. We can then consider the thread whose insert succeeded to have obtained the method's lock, and let it execute the method body.

When the method finishes executing and we want to release the lock, we execute the following SQL:

delete from methodLock where method_name ='method_name'
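
As a concrete illustration, here is a minimal JDBC sketch of this table-based lock, assuming a javax.sql.DataSource pointing at the database that holds the methodLock table; the class name and the stored desc value are only for illustration.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

public class MethodLockDao {
    private final DataSource dataSource; // assumed to point at the database holding methodLock

    public MethodLockDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // try to acquire the lock: the unique index on method_name lets only one insert succeed
    public boolean tryLock(String methodName) {
        String sql = "insert into methodLock(method_name, `desc`) values (?, ?)";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, methodName);
            ps.setString(2, "lock for " + methodName);
            return ps.executeUpdate() == 1;
        } catch (SQLException e) {
            // a duplicate-key error means someone else holds the lock; treat any failure as "not acquired"
            return false;
        }
    }

    // release the lock by deleting the record
    public void unlock(String methodName) {
        String sql = "delete from methodLock where method_name = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, methodName);
            ps.executeUpdate();
        } catch (SQLException e) {
            throw new IllegalStateException("failed to release lock for " + methodName, e);
        }
    }
}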


This simple implementation has several issues:

1. The lock strongly depends on the availability of the database. The database is a single point: once the database goes down, the business system becomes unavailable.

2. The lock has no expiration time. If the unlock operation fails, the lock record stays in the database forever and no other thread can ever acquire the lock.

3. The lock is non-blocking: the insert fails immediately if the record already exists, and a thread that fails to get the lock is not enqueued anywhere; to obtain the lock it has to trigger the acquire operation again.

4. The lock is non-reentrant: the same thread cannot acquire the lock again before releasing it, because its record already exists in the table.

Of course, there are other ways to solve the problems above.

  • The database is a single point? Deploy two databases and keep the data synchronized in both directions; once one goes down, switch quickly to the standby.
  • No expiration time? Run a scheduled task that periodically cleans up timed-out lock records from the database.
  • Non-blocking? Retry in a while loop until the insert succeeds and then return success.
  • Non-reentrant? Add a field to the table that records the host and thread information of the current lock holder. The next time a lock is requested, first query the record: if the host and thread information matches the current machine and thread, grant it the lock directly. A minimal sketch of this reentrant variant follows this list.
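
A minimal sketch of the reentrancy fix from the last bullet, continuing the MethodLockDao sketch above (add java.sql.ResultSet to its imports) and assuming an extra owner varchar column has been added to the methodLock table; the owner format is an arbitrary choice.

    // identifies the current machine and thread, e.g. "app-host-01:42"
    private String currentOwner() {
        String host;
        try {
            host = java.net.InetAddress.getLocalHost().getHostName();
        } catch (java.net.UnknownHostException e) {
            host = "unknown-host";
        }
        return host + ":" + Thread.currentThread().getId();
    }

    public boolean tryLockReentrant(String methodName) {
        String owner = currentOwner();
        try (Connection conn = dataSource.getConnection()) {
            // 1. if a record already exists, acquire it again only if we are its owner (reentrant)
            try (PreparedStatement ps = conn.prepareStatement(
                    "select owner from methodLock where method_name = ?")) {
                ps.setString(1, methodName);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        return owner.equals(rs.getString("owner"));
                    }
                }
            }
            // 2. otherwise try to insert; the unique index still guarantees a single winner
            try (PreparedStatement ps = conn.prepareStatement(
                    "insert into methodLock(method_name, `desc`, owner) values (?, ?, ?)")) {
                ps.setString(1, methodName);
                ps.setString(2, "lock for " + methodName);
                ps.setString(3, owner);
                return ps.executeUpdate() == 1;
            }
        } catch (SQLException e) {
            return false;
        }
    }

Note that the select-then-insert pair is not atomic; the unique index still prevents two owners, but a thread that loses the race simply sees its insert fail and returns false.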
Based on database exclusive locks

Besides adding and deleting records in the table, we can also implement a distributed lock with the exclusive locks that come with the database itself.

We still use the database table created above. A distributed lock can be implemented through the database's own exclusive locks. For a MySQL table based on the InnoDB engine, the lock operation can be implemented as follows:

public boolean lock() {
    connection.setAutoCommit(false);
    while (true) {
        try {
            // pseudocode: the row is exclusively locked once the query returns it
            result = select * from methodLock where method_name = xxx for update;
            if (result != null) {
                return true;
            }
        } catch (Exception e) {
            // e.g. lock wait timeout -- fall through and retry
        }
        sleep(1000);
    }
}


Adding for update after the query makes the database put an exclusive lock on the record during the query ( note that InnoDB only uses row-level locks when the query goes through an index; otherwise it falls back to a table-level lock ). Since we want a row-level lock here, an index must be added on method_name, and it must be created as a unique index; otherwise the problem arises that multiple overloaded methods cannot be locked independently at the same time ( for overloaded methods it is advisable to include the parameter types in the stored method name ). Once a record carries an exclusive lock, other threads can no longer obtain an exclusive lock on that row.

We can regard the thread that obtains the exclusive lock as the one that obtains the distributed lock. After acquiring the lock it can execute the method's business logic, and after the method finishes it releases the lock in the following way:

public void unlock() {
    connection.commit();
}


The lock is released through the connection.commit() operation.

This approach effectively solves the "lock cannot be released" and "non-blocking" problems mentioned above.

  • Blocking lock? The for update statement returns immediately when it succeeds, and blocks while it cannot acquire the lock, until it succeeds.
  • Service goes down after locking and the lock cannot be released? With this approach, once the service goes down the database connection is closed and the database releases the lock by itself.

But it still cannot directly solve the database single-point problem and the reentrancy problem.

There may be another problem here. Although we created a unique index on method_name and explicitly used for update to ask for a row lock, MySQL optimizes queries: even if an index field is used in the condition, whether the index is actually used is decided by MySQL by comparing the costs of different execution plans. If MySQL decides that a full table scan is more efficient ( for example when the table is very small ), it will not use the index, and in that case InnoDB takes a table lock rather than a row lock. If that happens, things get awkward...
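
If this is a concern in practice, one way to check is to look at the execution plan, and as a last resort the unique index can be forced; a hedged illustration ( FORCE INDEX is standard MySQL syntax, but whether forcing it is wise depends on the actual data ):

-- check which index (if any) the locking query would use
EXPLAIN SELECT * FROM methodLock WHERE method_name = 'method_name' FOR UPDATE;

-- force the unique index so InnoDB takes a row lock instead of a table lock
SELECT * FROM methodLock FORCE INDEX (uidx_method_name)
WHERE method_name = 'method_name' FOR UPDATE;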

Another problem: since we use an exclusive lock to perform the distributed locking, a row lock that is not committed for a long time occupies a database connection. Once there are many such connections, the database connection pool may be exhausted.

Summary

To summarize the ways of using a database to implement a distributed lock: both rely on a table in the database. One decides whether a lock currently exists by whether a record exists in the table; the other implements the distributed lock through the database's exclusive locks.

Advantages of using a database to implement distributed locks

It uses the database directly, which is simple and easy to understand.

Disadvantages of using a database to implement distributed locks

There are many corner cases to handle, and handling them makes the whole scheme more and more complicated.

Operating the database has a certain overhead, so performance needs to be considered.

Using the database's row-level locks is not necessarily reliable, especially when the lock table is small.

Distributed locks based on a cache

Compared with the database-based scheme, a cache-based implementation performs better. Moreover, many caches can be deployed as clusters, which solves the single-point problem.

Many mature cache products are currently available, including redis, memcached, and tair, which we use inside our company.

This article uses tair as the example for implementing a distributed lock. There are many articles on the web about doing the same with redis and memcached, and there are mature frameworks and algorithms that can be used directly.

Implementing a distributed lock based on tair is actually similar to redis: the core is the TairManager.put method.

public boolean tryLock(String key) {
    ResultCode code = ldbTairManager.put(NAMESPACE, key, "This is a Lock.", 2, 0);
    return ResultCode.SUCCESS.equals(code);
}

public boolean unlock(String key) {
    // invalidating the key removes it, which releases the lock
    ldbTairManager.invalid(NAMESPACE, key);
    return true;
}


There are also several problems with the above implementation:

1. The lock has no expiration time. If the unlock operation fails, the lock record stays in tair forever and other threads can no longer acquire the lock.

2. The lock is non-blocking: whether the put succeeds or fails, it returns immediately.

3. The lock is non-reentrant: after a thread obtains the lock, it cannot acquire it again before releasing it, because the key already exists in tair and another put operation will fail.

Of course, there are also ways to solve these problems.

  • No expiration time? tair's put method supports passing in an expiration time, and the data is deleted automatically once it expires.
  • Non-blocking? Retry the put in a while loop.
  • Non-reentrant? After a thread obtains the lock, store the current host and thread information as the value; before acquiring, check whether the requester is already the owner of the current lock. A sketch of the first two fixes follows this list.
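
A minimal sketch of the expiration and blocking fixes, reusing the TairManager.put call exactly as shown above; the assumptions ( taken from the article's usage ) are that put fails while the lock key already exists and that the last parameter is an expiration time in seconds. The owner value stored in the lock is illustrative.

// blocking acquire with an expiration time; put itself is non-blocking, so we poll until a deadline
public boolean lock(String key, int expireSeconds, long waitMillis) throws InterruptedException {
    // record who holds the lock; a real owner tag would include the machine's IP or hostname
    String owner = "host-" + Thread.currentThread().getId();
    long deadline = System.currentTimeMillis() + waitMillis;
    while (System.currentTimeMillis() < deadline) {
        ResultCode code = ldbTairManager.put(NAMESPACE, key, owner, 2, expireSeconds);
        if (ResultCode.SUCCESS.equals(code)) {
            return true;   // tair deletes the key automatically after expireSeconds
        }
        Thread.sleep(100); // someone else holds the lock -- wait briefly and retry
    }
    return false;
}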

But how long should the expiration time be? If it is set too short, the lock may be released automatically before the method finishes executing, which creates concurrency problems. If it is set too long, other threads waiting for the lock may wait longer than necessary for nothing. The same issue also exists when a database is used to implement the distributed lock.

Summary

A cache can be used instead of a database to implement distributed locks. It provides better performance, and many cache services are deployed as clusters, avoiding the single point of failure. Many cache services also provide methods that can be used to implement distributed locks, such as tair's put method and redis's setnx method, and they all support data expiration, so the expiration time can be set to directly control the release of the lock.

Advantages of using a cache to implement distributed locks

Good performance, and convenient to implement.

Disadvantages of using a cache to implement distributed locks

Controlling the lock's lifetime with an expiration time is not very reliable.

Distributed locks based on zookeeper

A distributed lock can be implemented based on zookeeper's ephemeral sequential nodes.

The general idea: when a client wants to lock a method, it creates a unique ephemeral sequential node under the directory of the znode corresponding to that method. Determining whether a client holds the lock is then very simple: it holds the lock if its node has the smallest sequence number among the ordered nodes. To release the lock, the client just deletes its ephemeral node. This also avoids the deadlock problem of locks that can never be released because a service went down after acquiring one.
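
As a rough sketch of this algorithm with the plain zookeeper client ( a simplified, non-production version that watches the current smallest node and re-checks the children on every change; the lock path is an assumption ):

import org.apache.zookeeper.*;
import org.apache.zookeeper.ZooDefs.Ids;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ZkMethodLock {
    private final ZooKeeper zk;
    private final String lockDir;   // e.g. "/locks/method_name" -- assumed to exist already
    private String myNode;

    public ZkMethodLock(ZooKeeper zk, String lockDir) {
        this.zk = zk;
        this.lockDir = lockDir;
    }

    public void lock() throws Exception {
        // create a unique ephemeral sequential node under the method's directory
        myNode = zk.create(lockDir + "/lock-", new byte[0],
                Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        while (true) {
            List<String> children = zk.getChildren(lockDir, false);
            Collections.sort(children);
            String smallest = lockDir + "/" + children.get(0);
            if (smallest.equals(myNode)) {
                return; // our node has the smallest sequence number -> we hold the lock
            }
            // otherwise wait until the current holder's node changes, then re-check
            CountDownLatch latch = new CountDownLatch(1);
            if (zk.exists(smallest, event -> latch.countDown()) != null) {
                latch.await();
            }
        }
    }

    public void unlock() throws Exception {
        zk.delete(myNode, -1); // deleting the ephemeral node releases the lock
    }
}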

Let's see whether zookeeper can solve the problems mentioned earlier.

  • The lock cannot be released? Using zookeeper effectively solves this: since the client creates an ephemeral node in zk, once the client that acquired the lock goes down, the ephemeral node is deleted automatically and other clients can acquire the lock again.

  • Non-blocking lock? A blocking lock can be implemented with zookeeper: the client creates a sequential node in zk and registers a listener on the node ahead of it. When that node changes, zookeeper notifies the client, which checks whether its own node now has the smallest sequence number among all current nodes; if so, it has obtained the lock and can execute the business logic.

  • Non-reentrant? Zookeeper can also solve this effectively. When creating its node, the client writes its host and thread information directly into the node's data. The next time it wants to acquire the lock, it compares this with the data in the current smallest node: if it matches its own information, it obtains the lock directly; otherwise it creates another ephemeral sequential node and joins the queue.

  • Single point problem? Zookeeper solves this effectively: zk is deployed as a cluster, and as long as more than half of the machines in the cluster are alive, it can keep providing service.

You can use zookeeper's third-party client library curator directly; it encapsulates a reentrant lock service.

public boolean tryLock(long timeout, TimeUnit unit) throws InterruptedException {
    try {
        return interProcessMutex.acquire(timeout, unit);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return false; // acquire threw, so the lock was not obtained
}

public boolean unlock() {
    try {
        interProcessMutex.release();
    } catch (Throwable e) {
        log.error(e.getMessage(), e);
    } finally {
        executorService.schedule(new Cleaner(client, path), delayTimeForClean, TimeUnit.MILLISECONDS);
    }
    return true;
}


InterProcessMutex, provided by curator, is the implementation of a distributed lock: its acquire method is used to acquire the lock, and its release method is used to release it.
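
For context, a minimal sketch of how the InterProcessMutex above might be constructed; the connection string, lock path, and retry settings are illustrative assumptions.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorLockSetup {
    public static InterProcessMutex createMutex() {
        // connect to the zk cluster with an exponential-backoff retry policy (1s base, at most 3 retries)
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        // one mutex per resource to lock; the path is the znode directory the lock nodes are created under
        return new InterProcessMutex(client, "/locks/method_name");
    }
}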

A distributed lock implemented with zk seems to match exactly the expectations we listed at the start of this article. However, that is not entirely true: a zookeeper-based distributed lock has a drawback in that its performance may not be as high as a cache service. Every time the lock is acquired and released, an ephemeral node has to be created and destroyed, and in zk creating and deleting nodes can only be performed by the leader server, which then has to synchronize the data to all the follower machines.

In fact, using zookeeper can also introduce concurrency problems, although they are not common. Consider this situation: because of network jitter, the client's session with the zk cluster is broken; zk then assumes the client has died and deletes the ephemeral node, at which point another client can acquire the lock, producing a concurrency problem. It is not common because zk has a retry mechanism: once the zk cluster stops receiving the client's heartbeat it retries, and the curator client supports several retry strategies; the ephemeral node is only deleted after multiple retries have failed. ( Therefore, choosing an appropriate retry strategy is also important: you need to find a balance between lock granularity and concurrency. )

Summary

Advantages of using zookeeper to implement distributed locks

It effectively solves the single point, non-reentrancy, non-blocking, and cannot-release problems, and the implementation is fairly simple.

Disadvantages of using zookeeper to implement distributed locks

Performance is not as good as a cache-based distributed lock, and you need to understand zk's principles.

Comparison of the three schemes

None of the above approaches is perfect. Just like CAP, complexity, reliability, performance, and so on cannot all be satisfied at once, so the right thing to do is to choose the one that best fits the application scenario at hand.

From the perspective of how hard they are to understand ( easiest first ):

Database > cache > zookeeper

From the perspective of implementation complexity ( lowest first ):

Zookeeper >= cache > database

From the perspective of performance ( highest first ):

Cache > zookeeper >= database

From the perspective of reliability ( highest first ):

Zookeeper > cache > database



