Skip to content

systemd dependency on db#3973

Merged
DaanHoogland merged 7 commits intoapache:4.13from
shapeblue:systemd-file
Mar 25, 2020
Merged

systemd dependency on db#3973
DaanHoogland merged 7 commits intoapache:4.13from
shapeblue:systemd-file

Conversation

@DaanHoogland
Copy link
Copy Markdown
Contributor

@DaanHoogland DaanHoogland commented Mar 16, 2020

Description

Fixes: #3954

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

How Has This Been Tested?

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

@rvalle @rhtyd what do you think?

@DaanHoogland DaanHoogland added this to the 4.13.1.0 milestone Mar 16, 2020
@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 16, 2020

@DaanHoogland here are my observations:

  • Where is that file going to be installed? /lib/systemd ?
  • How does it work if the user runs mysql on another system, and not where the CS Manager runs?
  • In my system the targets are called mysql.service are you sure it is mariadb.service?
  • are you sure you can use Wants instead of Requires? did you check the meaning? I found that different terms imply different types of dependencies. I have checked that Requires works for our case. check https://unix.stackexchange.com/questions/388586/systemd-requires-vs-wants

I was thinking on creating a complementary dependency (in /etc/systemd...) from the cloudstack-setup-databases.

We could create it once the admin requests a configuration with mysql at localhost. then we know that mysql or mariadb run on the same system

Then we could add a switch for cloudstack-setup-databases that disables or forces this configuration.

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 17, 2020

@DaanHoogland I have an important delivery at work at the end of month, but after that I can help you implement this.

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

@rvalle in the case of Wants, if the dependency is present it will be started before the dependant, and shutdown after. if it is not or if it fails to start the dependant is started anyway. (see #3954 (comment))
so

  • when it is present on the system it works like requires, and
  • when the user decides to install it on another system it will start anyway (but the user is responsible for proper startup and shutdown procedures)

If I interpret the docs correctly we can add the mysql.target with no harm. if either is present they will be started. The user must make sure not both are present though, to avoid conflict.

I can see how you envision a more elegant solution, but I would opt for next release for that, minimising code change during freeze and hastening release. If this can be tested and shown to work.... I would prefer to be working on the next release by the end of the month ;)

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 17, 2020

@DaanHoogland, no no, I think your solution is better.

No need to modify the script.

I will try to test your proposal. Stopping mysql/mariadb should stop the CS Management too.

And how do we do with the mysql/mariadb thing? the service units have different names, right?

Or is it recommended to go with mariadb in general?

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 17, 2020

@DaanHoogland also, is it .target or .service?

I understand .target as a group of units, but is mariadb a target?

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

DaanHoogland commented Mar 17, 2020

@rvalle I have not looked into the diff between target and service yet, missed that one. Need to look. Actually i may have completely goofed there. it probably should be db.target and then it should be filled accordingly. My systemd is not fluent obviously.

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 17, 2020

@DaanHoogland I will try to test tonight. might also switch to mariadb or test if the solution works with both databases.

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

@rvalle my use of target is due to an overzealous c&p action. I think it should be .service in both cases unless we decide to make it more versatile by having a db.target that can deal with either and will handle the case of both gracefully.

I'll update the PR to read service.

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

did you have a chance to test @rvalle ?

@Spaceman1984
Copy link
Copy Markdown
Contributor

I edited the cloudstack-management.service file:

[Unit]
Description=CloudStack Management Server
After=syslog.target network.target mysqld.service
Wants=mysqld.service

and ran

systemctl start cloudstack-management

In this case, mysqld started before cloudstack-management, but after executing:

systemctl stop mysqld

to see what happens when mysqld is stopped, cloudstack-management didn't stop and started throwing errors.

By chaning the cloudstack-management.service file to:

[Unit]
Description=CloudStack Management Server
After=syslog.target network.target mysqld.service
Requires=mysqld.service

and executing

systemctl stop mysqld

cloudstack was stopped first before mysql was stopped.

By my testing, using "Wants", instead of "Requires", mysqld will start before cloudstack-management, but will not shut down cloudstack-management before mysqld.

I will take a look at a solution

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 21, 2020

@Spaceman1984 did you remember the daemon-reload after the changes?

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 21, 2020

@DaanHoogland I will test this weekend

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 21, 2020

@DaanHoogland I managed to get some testing now...
@Spaceman1984 you were right.... adding Wants/After and stopping mysql does not stop CSM.
However adding Requires/After DOES work... which has some other implications.

This seems to imply that Wants/Requires do have semantic differences other than the ones we have discussed before... I am back to the Systemd documentation... see if I can find the details.

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 21, 2020

This is the best I can find from Redhat 7

Requires : Configures dependencies on other units. The units listed in Requires are activated together with the unit. If any of the required units fail to start, the unit is not activated.

Wants: Configures weaker dependencies than Requires. If any of the listed units does not start successfully, it has no impact on the unit activation. This is the recommended way to establish custom unit dependencies.

Looks like Wants would also start CSM if mysql is present but fails to start. I am thinking that manually stopping the service could also be interpreted as failure (?). I am not sure, but Requires is a much safer relationship for our use case.

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 21, 2020

I have continued testing a bit more.
The general case of starting and stopping the system, were both services are properly configured and work, is OK. mysql and CSM are started/stopped in the right order.

However, I can attempt to start CSM with mysql stopped and it will not start it automatically either.

I wonder what would happen in a situation in which CSM takes some time to stop, like just after the install.

I am going to try to run my to run the following test:

  • Install CSM
  • reboot right after finished.
  • Reestart
    On installation CSM does a lot of database operations and a restart right after it would break the database. With "Require" dependency it does not break, will it be the same with Wants?

@weizhouapache
Copy link
Copy Markdown
Member

@rvalle what if mariadb/mysql server run on another server ?

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 21, 2020

ok I did run the install test with restart after install.
The database was not corrupted with Wants.

So, this is the status:

Wants seems sufficient for startup, shutdown and reboot. While still works if no mysql is installed. as @weizhouapache is pointing.

Requires is stronger and will deal with things like shutting down CSM automatically if you happen to stop mysql (perhaps during a mysql update). Will also start mysql automatically at CSM startup even if is stopped, or will not start CSM if mysql is misconfigured. However it does require mysql to be present in the system. And you need to know if it is mysql or mariadb.

What we can do from here?

I think we should include Wants in the distribution file, because it causes no harm, and serves as a safeward. It will work and provide protection for most of the use cases.

We still have the option to modify the service unit from /etc/systemd as discussed in #3954 adding a Requires once we know that the mysql/mariadb runs on localhost for sure. This could be dome from the script cloudstack-setup-databases.

Another posibilty is simply to include an additional comment with instructions in the at the distrubuted Systemd. This comment could be extended to other depndencies, something like this in the provided Systemd script:

# Do not modify this file as your changes will be lost in the next CSM update.
# If you need to add specific dependencies to this service unit do it in the
# /etc/systemd/system/cloudstack-management.service.d/ directory
# If you know that either mysql or mariadb is running on this system is a good idea to
# add a Requires dependency there. Check systemd documentation for details

@DaanHoogland DaanHoogland marked this pull request as ready for review March 23, 2020 07:59
@Spaceman1984
Copy link
Copy Markdown
Contributor

Code LGTM

Copy link
Copy Markdown
Member

@weizhouapache weizhouapache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code LGTM

@Spaceman1984
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@Spaceman1984 a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1076

@Spaceman1984
Copy link
Copy Markdown
Contributor

Spaceman1984 commented Mar 23, 2020

Tested with mysqld, the config behaves as expected. @rvalle I tried to corrupt the db, but I was only able to do it while the upgrade scripts were running. Do you do the restart while the upgrade script is executing or after?

I executed:

systemctl stop cloudstack-management
cloudstack-setup-databases cloud:cloud@localhost --deploy-as=root
systemctl start cloudstack-management
tail -f /var/log/cloudstack/management/management.log

I watch the log until the SQL queries stop being logged then I restart.

At this point cloudstack seems to fire up properly, I didn't see any errors in the log.

But If I restart while the SQL scripts are being logged Cloudstack won't start up and I have to run the db setup script again to initialize the db.

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

@Spaceman1984 I could read your comment as a LGTM or as a request for changes, please pass a verdict!?!

@rvalle
Copy link
Copy Markdown
Contributor

rvalle commented Mar 24, 2020

@Spaceman1984 my use case is just to deploy CSM with ansible and reboot the system at the end, as part of some testing.

My test was trying to prove that after a reboot the server was in the exact same state as after the ansible playbooks run. The goal is trying to prove that after a reboot everything works as expect.

But then I noticed that all those tests would corrupt the database, so I started digging into this.
Personally I have decided to go ahead with the additional Requires dependency on /etc/systemd, as I think suits best my particular deployment.

But Still think that this PR is best for all cases. Someone installing CSM is going to tune many other things too.

@andrijapanicsb andrijapanicsb changed the title systemd dependency on db systemd dependency on mysql Mar 24, 2020
@DaanHoogland DaanHoogland changed the title systemd dependency on mysql systemd dependency on db Mar 24, 2020
@Spaceman1984
Copy link
Copy Markdown
Contributor

LGTM @DaanHoogland

@andrijapanicsb
Copy link
Copy Markdown
Contributor

@DaanHoogland you'll need to please add "mysqld" as the dependency as well (mysql happens to be an alias for mysqld 5.6 on CentOS7- what @Spaceman1984 tested) - as "mysqld" is the unit name for mysql 5.7 and there is no such alias

Tested by adding "mysql.service" alias:
/usr/lib/systemd/system/mysqld.service (edited)
[Install]
WantedBy=multi-user.target
Alias=mysql.service

LGTM (after adding "mysqld")

@DaanHoogland
Copy link
Copy Markdown
Contributor Author

I can add mysqld but it makes more sense to me replacing mysql with mysqld, after reading your comment. If mysql is an alias for mysqld these are not both needed.

@andrijapanicsb
Copy link
Copy Markdown
Contributor

on Ubuntu its mysql so I prefer to have both (all 3 that is)

@Spaceman1984
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@Spaceman1984 a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1085

Copy link
Copy Markdown
Contributor

@andrijapanicsb andrijapanicsb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andrijapanicsb
Copy link
Copy Markdown
Contributor

@blueorangutan test matrix

@blueorangutan
Copy link
Copy Markdown

@andrijapanicsb a Trillian-Jenkins matrix job (centos7 mgmt + xs71, centos7 mgmt + vmware67, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-1296)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 27432 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3973-t1296-xenserver-71.zip
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_scale_vm.py
Smoke tests completed. 75 look OK, 2 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_02_vpc_privategw_static_routes Failure 273.81 test_privategw_acl.py
test_03_vpc_privategw_restart_vpc_cleanup Failure 210.71 test_privategw_acl.py
test_04_rvpc_privategw_static_routes Failure 333.89 test_privategw_acl.py
test_01_scale_vm Failure 13.31 test_scale_vm.py

@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-1297)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 28532 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3973-t1297-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_vm_life_cycle.py
Smoke tests completed. 76 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_02_vpc_privategw_static_routes Failure 206.10 test_privategw_acl.py
test_03_vpc_privategw_restart_vpc_cleanup Failure 171.89 test_privategw_acl.py
test_04_rvpc_privategw_static_routes Failure 237.42 test_privategw_acl.py

@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-1298)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 30852 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3973-t1298-vmware-67u3.zip
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Smoke tests completed. 76 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_02_vpc_privategw_static_routes Failure 259.31 test_privategw_acl.py
test_03_vpc_privategw_restart_vpc_cleanup Failure 274.54 test_privategw_acl.py
test_04_rvpc_privategw_static_routes Failure 408.08 test_privategw_acl.py

@DaanHoogland DaanHoogland merged commit d93c245 into apache:4.13 Mar 25, 2020
@DaanHoogland DaanHoogland deleted the systemd-file branch March 25, 2020 18:57
DaanHoogland added a commit that referenced this pull request Mar 25, 2020
* 4.13:
  systemd dependency on db (#3973)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants