Uploaded image for project: 'eZ Publish / Platform'
  1. eZ Publish / Platform
  2. EZP-29206

MySQL deadlock errors happen when concurrently moving subtree

    Details

    • Sprint:
      [2.2] Sprint 5

      Description

      When executing LocationService::moveSubtree() concurrently, MySQL deadlocks can happen.
      Example of such error:

      StatusCodeError: 500 - {"ErrorMessage":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","errorCode":500,"errorMessage":"Internal Server Error","errorDescription":"Database error","Previous":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","ErrorMessage":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","errorCode":500,"errorMessage":"Internal Server Error","errorDescription":"An exception occurred while executing 'UPDATE `ezcontentobject` SET `section_id` = :placeholder2 WHERE `id` IN ( SELECT `contentobject_id` FROM `ezcontentobject_tree` WHERE `path_string` LIKE :placeholder1 )' with params [\"\\/1\\/2\\/58\\/157\\/%\", 1]:\n\nSQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","Previous":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","ErrorMessage":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","errorCode":500,"errorMessage":"Internal Server Error","errorDescription":"SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","Previous":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","ErrorMessage":{"_media-type":"application/vnd.ez.api.ErrorMessage+json","errorCode":500,"errorMessage":"Internal Server Error","errorDescription":"SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction"}}}}}}}}
      

      The issue was reproduced by creating a PublishVersionSignalSlot that uses LocationService::moveSubtree() to move the freshly created Content Object to another location. Then, the script was used that:

      1. Logs in with n sessions (POST /user/sessions)
        THEN
      2. Creates n content items using each session (POST /content/objects)
        THEN
      3. Publishes the n content items created (POST /content/objects/<contentId>/versions/<versionId>)

      Steps to reproduce

      1. Create new eZ Platform installation.
      2. Create and publish new Content Object. Note its Location remote Id.
      3. Create the following files inside AppBundle. You can use the AppBundle attached to this issue (just remember to change the Location remote Id inside LocationTestSlot).
        • src/AppBundle/Slot/LocationTestSlot.php:

          <?php
           
          namespace AppBundle\Slot;
           
          use eZ\Publish\Core\SignalSlot\Slot as BaseSlot;
          use eZ\Publish\API\Repository\Repository;
          use eZ\Publish\Core\SignalSlot\Signal;
           
          class LocationTestSlot extends BaseSlot
          {
              /**
               * @var Repository
               */
              private $repository;
           
              public function __construct(
                  Repository $repository
              ) {
                  $this->repository = $repository;
              }
           
              public function receive(Signal $signal)
              {
                  $content = $this->repository->getContentService()->loadContent($signal->contentId, null, $signal->versionNo);
           
                  $primaryParentLocation = $this->repository->getLocationService()->loadLocationByRemoteId('50f2bd981799c968f07589fa94be245b');
                  $currentMainLocation = $this->repository->getLocationService()->loadLocation($content->contentInfo->mainLocationId);
           
                  $this->repository->getLocationService()->moveSubtree($currentMainLocation, $primaryParentLocation);
              }
          }
          

        • src/AppBundle/Resources/config/signalslot.yml:

          services:
              app.slot.locationtest:
                  class: AppBundle\Slot\LocationTestSlot
                  arguments:
                      - @ezpublish.api.repository
                  tags:
                      - { name: ezpublish.api.slot, signal: ContentService\PublishVersionSignal }
          

        • src/AppBundle/DependencyInjection/AppExtension.php:

          <?php
           
          namespace AppBundle\DependencyInjection;
           
          use Symfony\Component\DependencyInjection\ContainerBuilder;
          use Symfony\Component\Config\FileLocator;
          use Symfony\Component\HttpKernel\DependencyInjection\Extension;
          use Symfony\Component\DependencyInjection\Loader;
           
          class AppExtension extends Extension
          {
              /**
               * {@inheritDoc}
               */
              public function load(array $configs, ContainerBuilder $container)
              {
                  $loader = new Loader\YamlFileLoader($container, new FileLocator(__DIR__."/../Resources/config"));
                  $loader->load('signalslot.yml');
              }
          }
          

      4. Run the script to concurently create and publish multiple Content Objects. Such script written in node.js can be found in the attachment.
      5. Some of the requests should error out with the message mentioned above. Since the issue only happens in concurrency situations, there is no way to reproduce the issue every time. If there was no error, try to run the script again.

        Activity

        Hide
        Jacek Foremski (Inactive) added a comment -

        I tested the issue further and adding Locations to Content also causes deadlocks. An example of Slot to test this behavior:

        <?php
         
        namespace AppBundle\Slot;
         
        use eZ\Publish\Core\SignalSlot\Slot as BaseSlot;
        use eZ\Publish\API\Repository\Repository;
        use eZ\Publish\Core\SignalSlot\Signal;
         
        class LocationTestSlot extends BaseSlot
        {
            /**
             * @var Repository
             */
            private $repository;
         
            public function __construct(
                Repository $repository
            ) {
                $this->repository = $repository;
            }
         
            public function receive(Signal $signal)
            {
                $content = $this->repository->getContentService()->loadContent($signal->contentId, null, $signal->versionNo);
         
                $primaryParentLocation = $this->repository->getLocationService()->loadLocationByRemoteId('f29ceaa45f5d8b26b551c7cd523c4114');
         
                $locationCreateStruct = $this->repository->getLocationService()->newLocationCreateStruct($primaryParentLocation->id);
                $this->repository->getLocationService()->createLocation($content->contentInfo, $locationCreateStruct);
            }
        }
        
        

        This might mean that the issue is more broad and we need to check the whole system for such cases.

        Show
        Jacek Foremski (Inactive) added a comment - I tested the issue further and adding Locations to Content also causes deadlocks. An example of Slot to test this behavior: <?php   namespace AppBundle\Slot;   use eZ\Publish\Core\SignalSlot\Slot as BaseSlot; use eZ\Publish\API\Repository\Repository; use eZ\Publish\Core\SignalSlot\Signal;   class LocationTestSlot extends BaseSlot { /** * @var Repository */ private $repository ;   public function __construct( Repository $repository ) { $this ->repository = $repository ; }   public function receive(Signal $signal ) { $content = $this ->repository->getContentService()->loadContent( $signal ->contentId, null, $signal ->versionNo);   $primaryParentLocation = $this ->repository->getLocationService()->loadLocationByRemoteId( 'f29ceaa45f5d8b26b551c7cd523c4114' );   $locationCreateStruct = $this ->repository->getLocationService()->newLocationCreateStruct( $primaryParentLocation ->id); $this ->repository->getLocationService()->createLocation( $content ->contentInfo, $locationCreateStruct ); } } This might mean that the issue is more broad and we need to check the whole system for such cases.
        Hide
        Jacek Foremski (Inactive) added a comment -

        It seems that the deadlocks are reproducible even without using the Signal Slot (meaning that step 3 can be skipped). It just happens very rarely - I tested this using 10 concurrent processes and the deadlock happened for one process on the 13th test. More information about the reproduced deadlock:

        ------------------------
        LATEST DETECTED DEADLOCK
        ------------------------
        2018-05-22 10:35:01 0x3d18
        *** (1) TRANSACTION:
        TRANSACTION 933501, ACTIVE 2 sec inserting
        mysql tables in use 1, locked 1
        LOCK WAIT 21 lock struct(s), heap size 1136, 22 row lock(s), undo log entries 16
        MySQL thread id 4683, OS thread handle 16500, query id 371480 localhost ::1 root update
        INSERT INTO `ezurlalias_ml` (`id`, `link`, `parent`, `action`, `lang_mask`, `text`, `text_md5`, `is_original`, `is_alias`, `alias_redirects`, `action_type`) VALUES ('3121', '3121', 0, 'eznode:678', 2, 'Create-test-2445', 'aa15a7a4fd783842c6851098f2221c30', 1, 0, 0, 'eznode')
        *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
        RECORD LOCKS space id 31633 page no 21 n bits 176 index PRIMARY of table `ezplatform`.`ezurlalias_ml` trx id 933501 lock_mode X locks gap before rec insert intention waiting
        Record lock, heap no 106 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
         0: len 4; hex 80000000; asc     ;;
         1: len 30; hex 616134386465666165373764623331356531656461376536366232343535; asc aa48defae77db315e1eda7e66b2455; (total 32 bytes);
         2: len 6; hex 0000000e3e75; asc     >u;;
         3: len 7; hex b600000157012c; asc     W ,;;
         4: len 10; hex 657a6e6f64653a363730; asc eznode:670;;
         5: len 6; hex 657a6e6f6465; asc eznode;;
         6: len 4; hex 80000000; asc     ;;
         7: len 4; hex 80000bff; asc     ;;
         8: len 4; hex 80000000; asc     ;;
         9: len 4; hex 80000001; asc     ;;
         10: len 8; hex 8000000000000002; asc         ;;
         11: len 4; hex 80000bff; asc     ;;
         12: len 16; hex 4372656174652d746573742d32343337; asc Create-test-2437;;
         
        *** (2) TRANSACTION:
        TRANSACTION 933502, ACTIVE 1 sec inserting, thread declared inside InnoDB 5000
        mysql tables in use 1, locked 1
        24 lock struct(s), heap size 3520, 22 row lock(s), undo log entries 16
        MySQL thread id 4680, OS thread handle 15640, query id 371492 localhost ::1 root update
        INSERT INTO `ezurlalias_ml` (`id`, `link`, `parent`, `action`, `lang_mask`, `text`, `text_md5`, `is_original`, `is_alias`, `alias_redirects`, `action_type`) VALUES ('3124', '3124', 0, 'eznode:679', 2, 'Create-test-2445', 'aa15a7a4fd783842c6851098f2221c30', 1, 0, 0, 'eznode')
        *** (2) HOLDS THE LOCK(S):
        RECORD LOCKS space id 31633 page no 21 n bits 176 index PRIMARY of table `ezplatform`.`ezurlalias_ml` trx id 933502 lock mode S
        Record lock, heap no 106 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
         0: len 4; hex 80000000; asc     ;;
         1: len 30; hex 616134386465666165373764623331356531656461376536366232343535; asc aa48defae77db315e1eda7e66b2455; (total 32 bytes);
         2: len 6; hex 0000000e3e75; asc     >u;;
         3: len 7; hex b600000157012c; asc     W ,;;
         4: len 10; hex 657a6e6f64653a363730; asc eznode:670;;
         5: len 6; hex 657a6e6f6465; asc eznode;;
         6: len 4; hex 80000000; asc     ;;
         7: len 4; hex 80000bff; asc     ;;
         8: len 4; hex 80000000; asc     ;;
         9: len 4; hex 80000001; asc     ;;
         10: len 8; hex 8000000000000002; asc         ;;
         11: len 4; hex 80000bff; asc     ;;
         12: len 16; hex 4372656174652d746573742d32343337; asc Create-test-2437;;
         
        Record lock, heap no 107 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
         0: len 4; hex 80000000; asc     ;;
         1: len 30; hex 386338643239313238666639326364396331376535613966383030343337; asc 8c8d29128ff92cd9c17e5a9f800437; (total 32 bytes);
         2: len 6; hex 0000000e3e78; asc     >x;;
         3: len 7; hex b9000001290156; asc     ) V;;
         4: len 10; hex 657a6e6f64653a363733; asc eznode:673;;
         5: len 6; hex 657a6e6f6465; asc eznode;;
         6: len 4; hex 80000000; asc     ;;
         7: len 4; hex 80000c0c; asc     ;;
         8: len 4; hex 80000000; asc     ;;
         9: len 4; hex 80000001; asc     ;;
         10: len 8; hex 8000000000000002; asc         ;;
         11: len 4; hex 80000c0c; asc     ;;
         12: len 16; hex 4372656174652d746573742d32343430; asc Create-test-2440;;
         
        *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
        RECORD LOCKS space id 31633 page no 21 n bits 176 index PRIMARY of table `ezplatform`.`ezurlalias_ml` trx id 933502 lock_mode X locks gap before rec insert intention waiting
        Record lock, heap no 106 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
         0: len 4; hex 80000000; asc     ;;
         1: len 30; hex 616134386465666165373764623331356531656461376536366232343535; asc aa48defae77db315e1eda7e66b2455; (total 32 bytes);
         2: len 6; hex 0000000e3e75; asc     >u;;
         3: len 7; hex b600000157012c; asc     W ,;;
         4: len 10; hex 657a6e6f64653a363730; asc eznode:670;;
         5: len 6; hex 657a6e6f6465; asc eznode;;
         6: len 4; hex 80000000; asc     ;;
         7: len 4; hex 80000bff; asc     ;;
         8: len 4; hex 80000000; asc     ;;
         9: len 4; hex 80000001; asc     ;;
         10: len 8; hex 8000000000000002; asc         ;;
         11: len 4; hex 80000bff; asc     ;;
         12: len 16; hex 4372656174652d746573742d32343337; asc Create-test-2437;;
         
        *** WE ROLL BACK TRANSACTION (1)
        

        Using the repository in the Slot listening to PublishVersionSignal just makes the issue happen more often (and, for example when using LocationService::moveSubtree() method, the deadlocks happen for another table - ezcontentobject instead of ezurlalias_ml ).

        Show
        Jacek Foremski (Inactive) added a comment - It seems that the deadlocks are reproducible even without using the Signal Slot (meaning that step 3 can be skipped). It just happens very rarely - I tested this using 10 concurrent processes and the deadlock happened for one process on the 13th test. More information about the reproduced deadlock: ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2018-05-22 10:35:01 0x3d18 *** (1) TRANSACTION: TRANSACTION 933501, ACTIVE 2 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 21 lock struct(s), heap size 1136, 22 row lock(s), undo log entries 16 MySQL thread id 4683, OS thread handle 16500, query id 371480 localhost ::1 root update INSERT INTO `ezurlalias_ml` (`id`, `link`, `parent`, `action`, `lang_mask`, `text`, `text_md5`, `is_original`, `is_alias`, `alias_redirects`, `action_type`) VALUES ('3121', '3121', 0, 'eznode:678', 2, 'Create-test-2445', 'aa15a7a4fd783842c6851098f2221c30', 1, 0, 0, 'eznode') *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 31633 page no 21 n bits 176 index PRIMARY of table `ezplatform`.`ezurlalias_ml` trx id 933501 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 106 PHYSICAL RECORD: n_fields 13; compact format; info bits 0 0: len 4; hex 80000000; asc ;; 1: len 30; hex 616134386465666165373764623331356531656461376536366232343535; asc aa48defae77db315e1eda7e66b2455; (total 32 bytes); 2: len 6; hex 0000000e3e75; asc >u;; 3: len 7; hex b600000157012c; asc W ,;; 4: len 10; hex 657a6e6f64653a363730; asc eznode:670;; 5: len 6; hex 657a6e6f6465; asc eznode;; 6: len 4; hex 80000000; asc ;; 7: len 4; hex 80000bff; asc ;; 8: len 4; hex 80000000; asc ;; 9: len 4; hex 80000001; asc ;; 10: len 8; hex 8000000000000002; asc ;; 11: len 4; hex 80000bff; asc ;; 12: len 16; hex 4372656174652d746573742d32343337; asc Create-test-2437;;   *** (2) TRANSACTION: TRANSACTION 933502, ACTIVE 1 sec inserting, thread declared inside InnoDB 5000 mysql tables in use 1, locked 1 24 lock struct(s), heap size 3520, 22 row lock(s), undo log entries 16 MySQL thread id 4680, OS thread handle 15640, query id 371492 localhost ::1 root update INSERT INTO `ezurlalias_ml` (`id`, `link`, `parent`, `action`, `lang_mask`, `text`, `text_md5`, `is_original`, `is_alias`, `alias_redirects`, `action_type`) VALUES ('3124', '3124', 0, 'eznode:679', 2, 'Create-test-2445', 'aa15a7a4fd783842c6851098f2221c30', 1, 0, 0, 'eznode') *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 31633 page no 21 n bits 176 index PRIMARY of table `ezplatform`.`ezurlalias_ml` trx id 933502 lock mode S Record lock, heap no 106 PHYSICAL RECORD: n_fields 13; compact format; info bits 0 0: len 4; hex 80000000; asc ;; 1: len 30; hex 616134386465666165373764623331356531656461376536366232343535; asc aa48defae77db315e1eda7e66b2455; (total 32 bytes); 2: len 6; hex 0000000e3e75; asc >u;; 3: len 7; hex b600000157012c; asc W ,;; 4: len 10; hex 657a6e6f64653a363730; asc eznode:670;; 5: len 6; hex 657a6e6f6465; asc eznode;; 6: len 4; hex 80000000; asc ;; 7: len 4; hex 80000bff; asc ;; 8: len 4; hex 80000000; asc ;; 9: len 4; hex 80000001; asc ;; 10: len 8; hex 8000000000000002; asc ;; 11: len 4; hex 80000bff; asc ;; 12: len 16; hex 4372656174652d746573742d32343337; asc Create-test-2437;;   Record lock, heap no 107 PHYSICAL RECORD: n_fields 13; compact format; info bits 0 0: len 4; hex 80000000; asc ;; 1: len 30; hex 386338643239313238666639326364396331376535613966383030343337; asc 8c8d29128ff92cd9c17e5a9f800437; (total 32 bytes); 2: len 6; hex 0000000e3e78; asc >x;; 3: len 7; hex b9000001290156; asc ) V;; 4: len 10; hex 657a6e6f64653a363733; asc eznode:673;; 5: len 6; hex 657a6e6f6465; asc eznode;; 6: len 4; hex 80000000; asc ;; 7: len 4; hex 80000c0c; asc ;; 8: len 4; hex 80000000; asc ;; 9: len 4; hex 80000001; asc ;; 10: len 8; hex 8000000000000002; asc ;; 11: len 4; hex 80000c0c; asc ;; 12: len 16; hex 4372656174652d746573742d32343430; asc Create-test-2440;;   *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 31633 page no 21 n bits 176 index PRIMARY of table `ezplatform`.`ezurlalias_ml` trx id 933502 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 106 PHYSICAL RECORD: n_fields 13; compact format; info bits 0 0: len 4; hex 80000000; asc ;; 1: len 30; hex 616134386465666165373764623331356531656461376536366232343535; asc aa48defae77db315e1eda7e66b2455; (total 32 bytes); 2: len 6; hex 0000000e3e75; asc >u;; 3: len 7; hex b600000157012c; asc W ,;; 4: len 10; hex 657a6e6f64653a363730; asc eznode:670;; 5: len 6; hex 657a6e6f6465; asc eznode;; 6: len 4; hex 80000000; asc ;; 7: len 4; hex 80000bff; asc ;; 8: len 4; hex 80000000; asc ;; 9: len 4; hex 80000001; asc ;; 10: len 8; hex 8000000000000002; asc ;; 11: len 4; hex 80000bff; asc ;; 12: len 16; hex 4372656174652d746573742d32343337; asc Create-test-2437;;   *** WE ROLL BACK TRANSACTION (1) Using the repository in the Slot listening to PublishVersionSignal just makes the issue happen more often (and, for example when using LocationService::moveSubtree() method, the deadlocks happen for another table - ezcontentobject instead of ezurlalias_ml ).
        Hide
        Andrzej Longosz added a comment -

        One possible solution, suggested in CS-6825: https://github.com/ezsystems/ezpublish-kernel/pull/2339

        Show
        Andrzej Longosz added a comment - One possible solution, suggested in CS-6825: https://github.com/ezsystems/ezpublish-kernel/pull/2339
        Hide
        Andrzej Longosz added a comment -

        Merged into 6.7@a85f61a and up (6.13, 7.1, master)

        Show
        Andrzej Longosz added a comment - Merged into 6.7@a85f61a and up ( 6.13 , 7.1 , master )
        Hide
        Marek Nocoń added a comment - - edited

        Tested on 1.7 and 1.13.2 with patch so far (retest on v2 incoming).
        Another issue created: https://jira.ez.no/browse/EZP-29257 (does not happen when using Redis as cache).
        This one is QA Approved.

        EDIT: Looks good on v2 too

        Show
        Marek Nocoń added a comment - - edited Tested on 1.7 and 1.13.2 with patch so far (retest on v2 incoming). Another issue created: https://jira.ez.no/browse/EZP-29257 (does not happen when using Redis as cache). This one is QA Approved. EDIT: Looks good on v2 too

          People

          • Assignee:
            Unassigned
            Reporter:
            Jacek Foremski (Inactive)
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Agile