TrigMC software release coordination: PTD's logbook - page for June 2018

Notes

Use reverse chronological order.

For each shift date show the NEW MRs, and any related relevant comments/questions; note down when I (or someone else) merged/closed each request, and mark the MR as done/closed by ticking the box ([x]). If an MR takes longer than a day to merge, leave it listed under the original shift date, but add comments prefaced with the date of the comment/action. Only MRs that are still active should have the box unticked ([ ]).

Therefore each MR will only appear once in the log, under the date of the day when I first became aware of it.

Active requests

There are currently 0 active (i.e. not yet merged) requests (listed from most recent to oldest): none.
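
The active-request list follows from the tick-box convention in the Notes, so it could in principle be derived from the log text rather than maintained by hand. The following is only a minimal sketch of that idea, assuming the log is available as plain text with one entry per line starting with "[x]" or "[ ]". The function name `active_requests` and the sample entry !99999 are hypothetical and are not part of this logbook.

```python
import re

# Matches entry lines such as "[x] !12123 Sweeping ..." or "[ ] !99999 ...".
# Group 1 is the tick state, group 2 is the first token (MR number or tag name).
ENTRY = re.compile(r"^\[( |x)\]\s+(\S+)")


def active_requests(log_text):
    """Return the identifiers of unticked entries, in the order they appear."""
    active = []
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group(1) == " ":
            active.append(match.group(2))
    return active


# Hypothetical sample: !99999 is an invented open request, for illustration only.
sample = """29/6/2018
[x] !12123 Sweeping !11334 from 21.0 to 21.0-TrigMC.
[ ] !99999 Hypothetical open request
28/6/2018
[x] !12181 Sweeping !11860 from 21.0 to 21.0-TrigMC.
"""

print(active_requests(sample))
```

Because the log is kept in reverse chronological order, the returned list is already ordered from most recent to oldest.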

29/6/2018

[x] !12123 Sweeping !11334 from 21.0 to 21.0-TrigMC. Remove MA27 and clean up private member variables (opened by Atlas Nightlybuild)

8.51am Pipeline failed (about 6h ago). Wait till it succeeds.

5.45pm Restarted the CI pipeline by posting "Jenkins please retry a build"; this restarted the pipeline within seconds.

30/6/2018 6.27pm Pipeline failed, on different steps from last time, so probably not related to the build itself. Requested a restart of the build.

2/7/2018 7.31am Yet another build has failed (requested by Carl Suster, a few hours ago), due to "externals".

2/7/2018 4.46pm Once again, the build has failed, due to externals (the 3rd one in a row with this failure mode, even though the first build - prior to these 3 - had "externals" OK, and failed for a different reason...)

3/7/2018 7.30am Requested a new build: started within a minute.

3/7/2018 7.48am Pipeline failed, on the first step (externals).

4/7/2018 9.31am Kira Abeling has posted that she thinks she has identified the reason behind the CI pipeline failures. (Something to do with fastjet tools.) She asks that we wait for further news before attempting another build.

10/7/2018 10.20am (CERN) John Meshreki (L1) asks if there is an update on the fastjet tools hangup that was believed to be holding back the CI for this MR. This was indeed one of the issues that affected !12357, which is now merged, so the fastjet tools issue must be resolved (?). Will request a rebuild now.

10/7/2018 10.26am CI pipeline started.

10/7/2018 9.17pm CI pipeline failed after running 8h39m, on "required tests". Problem is clearly no longer related to the fastjet tools (that would show up under "externals", which is OK). As far as I can tell, the pipeline failure seems to have been due to a problem with a database connection during the q221 and q431 tests (rather than the tests themselves). Assuming this was a spurious problem; will retry the build.

10/7/2018 9.21pm Pipeline has started.

11/7/2018 (CERN) 9.00am Pipeline still running (7h49; note that pipelines do sometimes go into "pending"/paused mode for significant chunks of time, e.g. when waiting for resources).

11/7/2018 11.30am Pipeline succeeded about 1h ago, after running for 9h34. Merged OK.


[x] !12390 Fix bug leading to creation of multiple empty VertexCollections for events with <2 FTK tracks (opened by John Baines)

8.54am CI pipeline is running ("triggered" (i.e., started?) about 21h ago). John has tested the new code (more details given in the GitLab page). I have set it to merge automatically once the pipeline succeeds.

10.08am Pipeline has finished successfully, and automatic merge went OK.


[x] !12395 ATLINFR-2503: Update AtlasExternals version to 1.0.33 in 21.0-TrigMC (opened by Oleg Kuprash)

8.57am CI pipeline is running (triggered about 19h ago).

12.36pm CI pipeline succeeded 1h ago. Merged it just now: all OK.

28/6/2018

[x] !12181 Sweeping !11860 from 21.0 to 21.0-TrigMC. 21.0 clean licenceandcopyright [MS] (opened by Atlas Nightlybuild)

5.15pm CI pipeline running. Will merge once it succeeds.

29/6/2018 8.44am Pipeline succeeded about 10h ago. Merged OK.


[x] !12183 Sweeping !11813 from 21.0 to 21.0-TrigMC. Updates and optimisation for the pattern bank production (opened by Atlas Nightlybuild)

5.18pm CI pipeline running. Will merge once it succeeds.

29/6/2018 8.45am Pipeline succeeded about 12h ago. Merged OK.

27/6/2018

[x] !12357 Update trig mc to 21.1.31 and 21.0.73 (opened by Julie Hart Kirk)

11.35am CI pipeline has failed (about a min. ago). Wait for it to re-run.

28/6/2018 10.52am CI pipeline running.

28/6/2018 5.09pm Pipeline succeeded a few hours ago. However, Stan Lai (@slai) posted a comment stating that further review is needed (at level 2). I will wait for that review to happen before I merge, and have posted a comment to that effect.

29/6/2018 2.55pm CI pipeline has succeeded, but can't merge due to a "There are merge conflicts" message. Posted a comment (for the attention of T Martin) to that effect.

29/6/2018 3.44pm Tim has replied to my comment ("Ah yes, think that has to be one for @hartj next week, standard git fetch upstream; git merge upstream/21.0-TrigMC should reveal the conflict."). It sounds like it is up to the original requester (Julie Kirk) to sort out the merge conflict; apparently this will happen next week.

2/7/2018 12.01pm In the last hour, Julie has made 12 new commits, which will hopefully address the "merge conflicts" issue. Pipeline is showing as "pending". Will wait for it to re-start and conclude successfully before merging.

3/7/2018 7.26am Pipeline failed again (externals OK, though; could just be a resource/timing issue?). Requested a build - pipeline running again within a minute.

3/7/2018 9.46am Tim has posted that another MR needs to go through first, to fix the issue with the pipeline ("Needs !12484 to be merged first to fix CI"). !12484 is labelled as "urgent"; see July 2018 logbook page.

3/7/2018 1.34pm I have now merged !12484; this should hopefully resolve the CI failures for this MR.

4/7/2018 7.28am The latest CI pipeline for this MR started about ten minutes ago.

4/7/2018 12.16pm CI pipeline has failed just now: due to "make"; all other steps (externals, cmake, required tests and optional tests) were OK.

5/7/2018 8.29am CI pipeline failed again; Julie thinks that it might be because it timed out after ten hours (the log shows it ran for 10h:00m:10s...!), and has started a new build.

6/7/2018 7.38am CI pipeline succeeded about 1h ago. Merged OK. Yay.

25/6/2018

[x] !12336 Sweeping !12026 from 21.0 to 21.0-TrigMC. rm FullChainTests (opened by Atlas Nightlybuild)

10.47am Pipeline still running. Waiting for it to complete before merging.

2.00pm Pipeline failed.

26/6/2018 7.30am Pipeline still showing as failed.

26/6/2018 11.06am Pipeline is now re-running. (Someone triggered it in the usual way by entering the comment "Jenkins please retry a build".)

26/6/2018 2.09pm CI pipeline failed (again) about an hour ago.

26/6/2018 6.11pm CI pipeline running again. I have set it to merge automatically when pipeline succeeds.

27/6/2018 11.34am CI pipeline succeeded yesterday evening at 20.08, and was automatically merged then.

16/6/2018

[x] !12172 Sweeping !12027 from 21.0 to 21.0-TrigMC. remove memleak test (opened by Atlas Nightlybuild)

8.50pm Merged OK.

15/6/2018

[x] EventOverlayJobTransforms-00-06-32-11 (requested by Andrzej Olszewski)

12.45pm Andrzej Olszewski requests updating the TrigMC-20.7.9.8.8 cache with the EventOverlayJobTransforms-00-06-32-11 tag. The only changes wrt the previous tag are config files; Andrzej has validated the changes. He thinks it is therefore safe to proceed to a build immediately, which he is requesting.

1.08pm I have added the new tag to the cache and marked it as "accepted". Have also emailed atlas-sw-release-shifters@cern.ch requesting a build without waiting for the next nightly.

17/6/2018 1.30pm Installation and distribution happened yesterday. Have now notified all concerned that the new release is available.

9/6/2018

[x] !11925 Sweeping !11771 from 21.0 to 21.0-TrigMC. 21.0 clean licenceandcopyright [InDet] (opened by Atlas Nightlybuild)

8.33pm Merged OK.

4/6/2018

[x] EventOverlayJobTransforms-00-06-32-10 (requested by Andrzej Olszewski)

Andrzej Olszewski has asked for an update to TrigMC-20.7.9.8.7, namely to update to EventOverlayJobTransforms-00-06-32-10 urgently and request a build immediately.

This cache is still managed with the old Tag Collector system. The last activity on it seems to have been in November 2016, with a couple of tags added by Tulay Donszelmann. I have checked that, although the latest tag available is -00-06-39, the one currently in the cache is -00-06-32-09 (so the update probably does not involve a huge amount of change wrt the current version). I have checked the nightlies (still being built every night) and there is nothing unusual there.

9.20am The request came from Andrzej on Friday night, but I only saw it now. He seems to request that we progress to a build without going through the usual nightly test. I have emailed him to check that this is the case, before accepting the tag and requesting the build, as there is always a risk that the build has problems otherwise.

9.36am Andrzej has replied to my email: he is happy to wait for the nightly and have the build tomorrow, if the nightly is OK.

5/6/2018 10am Checked the nightly (seems OK) and mailed Andrzej to invite him to look at the results and confirm that he wants me to go ahead with making a release out of this nightly.

6/6/2018 10.53am Andrzej has just now replied to my message and confirmed that, having looked at the test results, he is happy to go ahead with the release.

6/6/2018 11.03am I have accepted the tag into the cache, and emailed atlas-sw-release-shifters@cern.ch requesting that they make a release out of the rel_2 cache (yesterday's nightly) and install it on the Grid.

6/6/2018 11.58am Oana Vickey Boeriu (release shifter) has confirmed that the release has been built (see details in the usual place, here), and that installation on the Grid is in progress.

6/6/2018 3.00pm (Grid deployment started 1h30 ago) Emailed developers and atlas-trig-relcoord@cern.ch to let them know that the release is now available and deployed on the Grid.


-- PedroTeixeiraDias - 28 Jun 2018


Topic revision: r24 - 17 Jul 2018 - PedroTeixeiraDias